PREDICTION OF DEMAND FOR PRIMARY BOND OFFERINGS USING ARTIFICIAL NEURAL NETWORKS

Purpose: Primary bond markets represent an interesting investment opportunity not only for banks, insurance companies, and other institutional investors, but also for individuals looking for capital gains. Since offered securities vary in terms of their rating, industrial classification, coupon, or maturity, buyers' demand for particular offerings often exceeds the issued volume, and the price of the given bond on the secondary market consequently rises. Investors might be regarded as consumers purchasing a required service according to their specific preferences at a desired price. This paper aims to analyze demand for bonds on the primary market using artificial neural networks. Design/methodology: We design a multilayered feedforward neural network trained by the Levenberg-Marquardt algorithm in order to estimate demand for individual bonds based on parameters of particular offerings. Outcomes obtained by the artificial neural network are compared with conventional econometric methods. Findings: Our results indicate that the artificial neural network significantly outperformed standard econometric techniques and, on the examined sample of primary bond offerings, achieved considerably better performance in terms of prediction accuracy and mean squared error. Originality: We show that the proposed neural network is able to successfully predict demand for primary bond offerings based on their specifications. Moreover, we identify relevant parameters of issues which are able to considerably affect total demand for a given security. Our findings might not only help investors to detect marketable securities, but also enable issuing entities to increase demand for their bonds in order to decrease their offering price.


INTRODUCTION
A public offering might be considered an intensive effort to obtain resources on capital markets. The decision to finance corporate intentions by issuing particular securities is usually based on the financial plan and long-term strategy of the issuer. Compared to an equity offering, the main benefits of debt financing are the large amount of funds obtained without a change in the ownership structure and the tax shield from paid coupons. Despite its advantages, this process is rather costly, time-consuming and complicated; therefore it must be well prepared and carefully performed. In order to optimize the public offering, issuers usually mandate a large global bank or broker to lead the issue and specify the offering price based on customer calls, due diligence and industry analysis. Underwriters' specialization in the sales and marketing of securities lowers issuers' transactional and informational costs of capital (Fang, 2005). The leading bank should establish an appropriate maturity, currency, and legal domicile based on the issuer's needs. The expert team which deals with the development and delivery of issue specifications must implement not only the requirements of the company, but also the demands of the main investors such as commercial banks, insurance companies, hedge funds, pension funds and others. For the issuer it is crucial that the underwriter provides optimal services so that investors cover the entire desired volume at the lowest possible price. Given that the volume of large offerings is often in the billions, even a small price change can cause a significant difference in total cost. The quality of the leader's services in managing the issue is important not only for the issuer who pays for it, but also for investors. If the underwriter sets the offered yield too low, investors do not obtain corresponding value for their funds in terms of an adequate balance of return, risk and liquidity. As a result, the amount of funds requested by the issuer might not be raised. It may also happen that, owing to effective marketing, the underwriter sells the entire issue, but investors feel deceived and end their cooperation with the given bank because of the subsequent decline of the bond price in the secondary market. On the other hand, in the case of large yields investors will be satisfied and the issue oversubscribed, but the issuing client will be paying higher costs until the maturity, or eventual call date, of the bond. It is therefore very important to estimate the parameters of the offering in such a way that not only the client is satisfied, but also the investors who purchase the issue.
Due to their importance and complexity, public offerings have received considerable attention in academic research. The most frequently examined topics include the under-pricing of offerings, relations with the underwriter, the marketing of offerings, and their allocation. Ke, Liang Liao and Hsu (2007) explored determinants of different types of bonds at the initial public offerings for the Taiwan Stock Exchange. Their analysis suggested that companies with large research and development expenses were more likely to issue straight bonds, while firms with higher future growth opportunities were more likely to issue convertible obligations. They also showed that the need for financing was the major parameter that influenced the types of issued bonds, in that firms with more significant financing needs were more likely to issue convertible bonds and vice versa. On a sample of 353 firms, Davydov, Nikkinen and Vähämaa (2013) examined the relationships between company valuation and the sources of debt financing. Their results indicated that companies which offered public debt performed worse than firms with other sources of debt financing in terms of stock market valuation, i.e. their market value decreased. Findings of Altunbaş, Kara and Marqués-Ibáñez (2010) suggested that companies with higher credit level and financial leverage depended more on public debt, while more profitable firms with large market value relied more on syndicated bank loans. Hale and Santos (2008) claimed that more creditworthy companies with high demand for external funds offered their initial public obligations earlier. Since many firms have issued exchangeable debt as a popular method of financing in recent years, convertible debt offerings have also been researched by several studies (Kang and Lee, 1996; Lewis, Rogalski and Seward, 2002; Danielova, Smart, and Boquist, 2010). Dutordoir and Van de Gucht (2007) stated that stockholder reactions to convertible debt announcements were significantly less negative during hot debt windows. Moreover, they emphasized that these windows were primarily utilized by companies with higher costs of attracting external funds. Altı (2005), Baker and Wurgler (2002) and Schultz (2003) focused on offering market timing and concluded that the capital structure of firms was strongly related to historical market values. Interesting studies on the features of debt offerings have also been provided by Eckbo (1986), Spiess and Affleck-Graves (1999), and Garay and Molina (2014), while Demers and Lewellen (2003), and Cook, Kieschnick and Van Ness (2006) focused on the benefits of marketing and promotion.
Regarding underwriter selection, several studies showed advantages of hiring a high-reputation issue leader (Carter and Manaster, 1990; Wang and Yung, 2011) with a strong connection to institutional investors (Chen and Wilhelm, 2008; Neupane and Thapa, 2013). Underwriter reputation has also been examined by Beckman et al. (2001), Roten and Mullineaux (2002), Loureiro (2010), Andres, Betzer and Limbach (2014), and Chua (2014), stating that the selection of top-tier underwriters had a significant impact on security valuation and long-term performance. McKenzie and Takaoka (2008) explored the role of the leading underwriter's reputation in defining the probability of switching underwriters between particular issues. They argued that the probability of a switch significantly increased if the rating of the leading underwriter of the initial issue declined. There was also evidence that leaders who raised the degree of overpricing of the initial issue were more likely to be selected to act as the leading underwriter of the subsequent offering. Krigman, Shaw and Womack (2001) stated that offerings of switching companies were significantly less under-priced than those of non-switching companies and that firms switched leaders mostly to graduate to a higher-reputation underwriter.
On the other hand, Butler, O'Connor Keefe and Kieschnick (2013) examined the statistical robustness of parameters used to explain initial public offering returns. They established a list of robust variables, evaluated their implications for different theories of under-pricing, and illustrated how applying a set of robust explanatory variables can lead to different conclusions. If the issue was priced exactly at its intrinsic value, large and well-informed investors would completely cover the issued volume in the case of lucrative deals and hold back in the case of unprofitable ones. Under-pricing of the offering is crucial in order to guarantee that uninformed investors also purchase the issue (Rock, 1986). Focusing on initial public offerings, Booth and Chua (1996) argued that required returns to investors decrease with large liquidity, and Purnanandam and Swaminathan (2004) suggested that the median offering was overvalued at the offer by 50% relative to its industry peers. The role of venture capital in under-pricing public offerings has been explored by Lee and Wahal (2004). They questioned the role of venture capitalists in the under-pricing of public issues between 1980 and 2000 and argued that the venture funds represented an endogenous preference on the part of the venture capitalist and the entrepreneur. Venture capital backed issues registered a larger first-day gain than identical non-venture backed issues. Additional interesting research on under-pricing has been done by Hanley (1993), Brennan and Franks (1997), Francis and Hasan (2001), Habib and Ljungqvist (2001), Ellul and Pagano (2006), and Zheng and Li (2008), concluding that under-pricing had direct effects on secondary market liquidity.
If the issuing company decides to issue its securities globally, it is very important that it be evaluated by a well-known rating agency, which should provide an objective assessment of its current economic situation. Baker and Mansi (2002) compared a sample of industrial bond issuers and institutional investors on different issues according to credit ratings. Their results showed that while investors required one or two ratings, issuing companies thought that they needed more ratings. Issuers utilized multiple ratings to raise the probability of a correct evaluation and so ensure the optimal interest rate. Large sophisticated investors, however, had the ability to perform their own credit capacity analysis. Institutional investors therefore used the rating as a decision support variable, but not as the exclusive criterion. The results of An and Chan (2008) indicated that offerings with credit ratings were under-priced significantly less than offerings without credit ratings. Their suggestions were consistent with the statement that credit ratings reduce the ex-ante uncertainty and information asymmetry among investors. They also argued that it was the existence of credit ratings, not the credit rating level, that reduced the under-pricing, which was consistent with the information asymmetry explanation of public offering under-pricing.
Standard econometric methods might have several limitations regarding the complexity of public offering problems. Conventional models require various assumptions about the data and variables, but public issues include many variables with unknown or ill-defined relationships. Since artificial neural networks have been successfully applied to solve nonlinear and challenging problems, Jain and Nag (1995) developed a neural network model for pricing initial public offerings. The neural network model significantly improved the accuracy of prediction and reduced under-pricing costs. Robertson et al. (1998) proposed neural network models in order to estimate the first-day return of an initial public offering. They divided the data set into technology and nontechnology offerings and constructed a regression model and two neural network models. Their results indicated that the neural network models performed better on both the technology and nontechnology groups and outperformed the linear regression model at predicting the first-day return of a public offering.
In this paper, we aim to analyze demand for bonds on the primary market using artificial neural networks. We utilize a multi-layered feed forward neural network trained by the Levenberg-Marquardt algorithm in order to estimate demand for individual bonds based on parameters of individual offerings. Furthermore, this paper contributes by applying conventional econometric methods in order to identify relevant characteristics of issues which are able to considerably affect the total demand for a given security. The remainder of this paper is organized as follows. Section 2 describes the principles of artificial neural networks and the applied learning algorithm. Section 3 presents the data and reports our empirical findings on the demand for debt offerings. Finally, Section 4 concludes the paper.

METHODOLOGY
The quality of underwriting services is crucial in the debt offering process. When companies negotiate bond financing, they choose the issue leader according to their needs and the bank's reputation. The highest offer price investors are willing to pay is determined not only by the financial stability and credit capacity of the issuer, but also by the optimization of offering specifications, which might be a demanding task. Artificial neural networks are computational structures that emulate the acquisition of knowledge in biological neural systems and solve stochastic, nonlinear, or ill-defined problems by applying relatively simple mathematical operations in a parallel manner. They have been actively used for applications such as bankruptcy prediction, cost prediction, revenue forecasting, credit scoring and more (Lee and Chen, 2005; Hayashi et al., 2010; Moosmayer et al., 2013; Tang and Chi, 2005; West, 2000).
A fundamental information-processing unit that is necessary to the functioning of every neural network is the neuron (Figure 1). Information $x_j$ at the input of synapse $j$ linked to neuron $i$ is multiplied by weight $w_{ij}$. The neuron sums all the inputs it receives, with each input being multiplied by the weight associated with the particular connection. An activation function, typically the sigmoid function or hyperbolic tangent, restricts the amplitude range of the neuron output to some limited interval, usually from minus one to one or from zero to one.
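The weighted sum and squashing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `neuron_output`, the bias term `b`, and the numeric values are assumptions chosen only for the example.

```python
import numpy as np

def neuron_output(x, w, b=0.0, activation=np.tanh):
    """Output of one neuron: activation(sum_j w_ij * x_j + b)."""
    v = np.dot(w, x) + b      # weighted sum of the inputs plus bias
    return activation(v)      # tanh squashes the result into (-1, 1)

# Illustrative inputs and weights (hypothetical, not from the bond data).
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.1])
y = neuron_output(x, w, b=0.1)
```

With the hyperbolic tangent as activation, the output always lies strictly between minus one and one, whatever the size of the weighted sum.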

Figure 1 - An artificial neuron
Network architecture denotes the way individual neurons are connected and coordinated. Multi-layered feed forward networks involve one or more hidden layers with hidden computational neurons. By adding hidden layers, the network acquires the ability to extract high-order statistics, especially with larger input vectors (Figure 2).

Figure 2 - Multi-layered feed forward network
The output signals from the previous layer are applied as input signals to the following layer. Provided that the activation functions of the hidden neurons are nonlinear, it has been proven (Cybenko, 1989; Hornik, Stinchcombe and White, 1989) that a network with a single hidden layer is able to approximate to arbitrary precision any function with finitely many discontinuities. Networks with a threshold squashing function might require two hidden layers (Sontag, 1992).
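A forward pass through a network with a single hidden layer might look like the sketch below. The seven inputs mirror the offering parameters used later in the paper (volume, coupon, spread, three ratings, maturity); the linear output neuron, the random initialization, and the names `init_network` and `forward` are assumptions made for illustration only.

```python
import numpy as np

def init_network(n_inputs, n_hidden, n_outputs=1, seed=0):
    """Randomly initialised weights and zero biases (illustrative choice)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_hidden, n_inputs)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_outputs, n_hidden)),
        "b2": np.zeros(n_outputs),
    }

def forward(params, X):
    """X has shape (n_samples, n_inputs); returns (n_samples, n_outputs)."""
    hidden = np.tanh(X @ params["W1"].T + params["b1"])  # nonlinear hidden layer
    return hidden @ params["W2"].T + params["b2"]        # linear output (assumed)

params = init_network(n_inputs=7, n_hidden=10)
X_demo = np.random.default_rng(1).normal(size=(5, 7))   # synthetic inputs
y_demo = forward(params, X_demo)
```

The outputs of the hidden layer serve directly as inputs to the output layer, which is the layer-to-layer signal flow the paragraph describes.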
The primary advantage of artificial neural networks is their ability to extract information from the data by iterative adjustments of connection weights and biases. Every performed iteration should increase the network's knowledge of the explored data. Based on external signals, the network modifies its free parameters and responds in a new way. The technique by which networks update their weights and biases is called a learning algorithm. In the case of supervised learning, the data is presented to the network via input and output samples and the parameters are then modified under the influence of an error signal. This signal represents the difference between the reached and desired output of the network. The error for neuron $i$ is defined as $e_i(n) = t_i(n) - y_i(n)$, where $t_i(n)$ is the target output and $y_i(n)$ the actual output of the neuron at iteration step $n$. Various learning algorithms have been proposed, such as gradient descent or conjugate gradient. In contrast to standard gradient methods, second-order information about the error function surface might be beneficial for enhancing convergence. In this paper we use the algorithm presented by Levenberg (1944) and Marquardt (1963), which is very well suited to neural network learning, since it was constructed for optimization tasks that consist of sums of squares of nonlinear functions, similar to the network error function. Its major advantage is that it was designed to achieve second-order training speed without the necessity of computing and inverting the exact local Hessian matrix (see Gupta, Jin and Homma, 2003). The Levenberg-Marquardt algorithm updates weights in the following direction: $\Delta w(n) = -\left[J^T(n)J(n) + \mu I\right]^{-1} J^T(n)\,e(n)$, where $J(n)$ is the Jacobian matrix consisting of first derivatives of the network errors with respect to the connection weights, $\mu$ is a control parameter, $I$ is the identity matrix and $e(n)$ is the vector of errors. This formula is relatively simple and convenient, since the Jacobian matrix is easier to handle than the inversion of the exact Hessian matrix. If $\mu$ is zero, the algorithm becomes a Newton method with the approximated Hessian matrix $J^T(n)J(n)$. On the other hand, with increasing $\mu$, the algorithm approaches gradient descent with a small learning rate. This method balances the speed of the Newton method and the convergence of gradient-based techniques. Its only shortcoming is the storage requirement. Since it involves matrix inversion, the Levenberg-Marquardt algorithm requires a lot of memory in each iteration. This method is therefore more suitable for medium-sized neural networks (Hagan and Menhaj, 1994).
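The Levenberg-Marquardt update can be sketched on a toy least-squares problem as follows. This is a hedged illustration rather than the training code used in the paper: the step has the form dw = -(J^T J + mu I)^-1 J^T e with errors e = t - y, while the rule that divides or multiplies `mu` by 10, the two-parameter model `w[0] * tanh(w[1] * x)`, and all function names are assumptions.

```python
import numpy as np

def lm_step(w, residual_fn, jacobian_fn, mu):
    """One update: w + dw with dw = -(J^T J + mu I)^-1 J^T e."""
    e = residual_fn(w)                     # vector of errors e = t - y
    J = jacobian_fn(w)                     # derivatives of e w.r.t. w
    A = J.T @ J + mu * np.eye(w.size)      # damped Hessian approximation
    return w - np.linalg.solve(A, J.T @ e)

def fit_lm(w, residual_fn, jacobian_fn, mu=1e-2, n_iter=50):
    sse = float(np.sum(residual_fn(w) ** 2))
    for _ in range(n_iter):
        w_new = lm_step(w, residual_fn, jacobian_fn, mu)
        sse_new = float(np.sum(residual_fn(w_new) ** 2))
        if sse_new < sse:                  # accept: move towards Newton
            w, sse, mu = w_new, sse_new, mu / 10.0
        else:                              # reject: move towards gradient descent
            mu *= 10.0
    return w, sse

# Toy problem (synthetic): recover the parameters of t = 2 * tanh(0.5 * x).
x = np.linspace(-3.0, 3.0, 40)
t = 2.0 * np.tanh(0.5 * x)

def residuals(w):
    return t - w[0] * np.tanh(w[1] * x)

def jacobian(w):
    th = np.tanh(w[1] * x)
    return np.column_stack([-th, -w[0] * x * (1.0 - th ** 2)])

w_start = np.array([1.0, 1.0])
sse_start = float(np.sum(residuals(w_start) ** 2))
w_fit, sse_fit = fit_lm(w_start, residuals, jacobian)
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit inverse, but the matrix `J.T @ J` still grows with the square of the number of weights, which is the storage cost noted above.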

RESULTS
The analysis in this paper covers 945 straight USD obligations publicly issued between 2003 and 2014. Data on bonds offered by individual companies and financial institutions, which include the issue volume in USD bln., coupon in %, spread over corresponding mid-swaps in basis points, rating from Moody's, Standard & Poor's and Fitch, and bond maturity, were taken from BondRadar based on the Bloomberg database. It should be noted that the rating variables were simplified to an equidistant scale, i.e. an obligation with prime rating (Aaa/AAA/AAA) obtained 19 points, while companies close to default (Caa3/CCC-/CCC) obtained 1 point. The dependent variable was the demand of investors as a multiple of offered volume.
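The equidistant rating scale can be illustrated with a small lookup. The paper states only the two endpoints (Aaa = 19 points, Caa3 = 1 point); the intermediate notches below follow the standard Moody's long-term ladder, which happens to contain exactly 19 grades between those endpoints, and the helper name `rating_to_points` is hypothetical.

```python
# Moody's long-term grades from prime (Aaa) down to Caa3: 19 notches.
# Intermediate ordering is an assumption based on the standard ladder.
MOODYS_SCALE = [
    "Aaa", "Aa1", "Aa2", "Aa3", "A1", "A2", "A3",
    "Baa1", "Baa2", "Baa3", "Ba1", "Ba2", "Ba3",
    "B1", "B2", "B3", "Caa1", "Caa2", "Caa3",
]

def rating_to_points(rating, scale=MOODYS_SCALE):
    """Map a rating to equidistant points: best grade = len(scale), worst = 1."""
    return len(scale) - scale.index(rating)

points_prime = rating_to_points("Aaa")
points_near_default = rating_to_points("Caa3")
```

The same function could be applied to the Standard & Poor's and Fitch scales by passing the corresponding ordered list as `scale`.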
Important decisions in creating the neural network are the selection of the number of hidden layers and the number of neurons in each hidden layer. Unfortunately, there is no exact theoretical framework in the area of network topology selection. Researchers usually experiment with the number of hidden layers and neurons. It is also essential to emphasize that in the case of supervised learning it is necessary to divide the data into three separate groups. The first, the training set, is used for calculating the error signal to modify the connection weights and biases. The second, smaller group is the validation set, whose objective is to monitor the error during the learning progress. In the initial phase of training, the validation error, as well as the training error, should decrease rapidly. When the network starts to overfit the training data, the training error still decreases, but the validation error slowly increases. The network is learning patterns in the presented inputs, but when it begins to lose its generalization ability, the validation error increases. The stored optimal connection weights and biases are those which produced the minimal value of the validation error. The third set of data, not used during the training, is the testing set. It is used to evaluate the overall outcomes of the network.

Figure 3 - Network learning process
In order to precisely measure the performance of the proposed networks, we separated the data into groups containing 70% (training), 15% (validation) and 15% (testing) of observations. Figure 3 demonstrates the learning process, monitoring all three sets of data. The minimal value of mean squared error on the validation set was achieved in the third epoch.
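The 70/15/15 division and the choice of the epoch with minimal validation error can be sketched as below. Random assignment of observations to the three groups is an assumption (the paper does not state how the groups were drawn), and the validation errors passed to `best_epoch` are made-up numbers used only to show the selection rule.

```python
import numpy as np

def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle indices 0..n-1 and cut them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    # Whatever remains after the first two cuts forms the testing set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def best_epoch(val_errors):
    """1-based epoch at which the validation error is minimal."""
    return int(np.argmin(val_errors)) + 1

train_idx, val_idx, test_idx = split_indices(945)

# Hypothetical validation errors: falling at first, rising once overfitting starts.
chosen = best_epoch([0.9, 0.5, 0.3, 0.35, 0.4])
```

The weights stored at the chosen epoch would be the ones evaluated on the testing set, which plays no part in training or in selecting the stopping point.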
Since there is no theoretical background precisely defining the required network topology, we have tested several alternatives. Tables 1, 2 and 3 present the results of networks with one hidden layer involving 10, 15 and 20 hidden neurons and the hyperbolic tangent as an activation function.

Figure 4 - Regression results
To compare the outcomes of the proposed neural networks with a conventional econometric technique, we have constructed an ordinary least squares model. The dependent variable was again the demand of investors as a multiple of offered volume, and the independent variables were again represented by issue volume in USD bln., coupon in %, spread over corresponding mid-swaps in basis points, rating from Moody's, Standard & Poor's and Fitch, and bond maturity. Since we wanted to compare the out-of-sample prediction ability of ordinary least squares as well, we have created two sets of data. The first 80% was considered a training sample, while the following 20% was treated as a testing sample. Unfortunately, the construction of OLS does not allow a validation sample to be established to preserve the generalization ability of the model. Table 4 presents the results of ordinary least squares on the data sample of 756 observations.
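The ordinary least squares benchmark with an ordered 80/20 split can be sketched as follows. The data here are synthetic stand-ins generated at random (the actual bond sample is not reproduced), and the helper names `fit_ols` and `predict_ols` are hypothetical; only the split sizes follow the text, since 80% of 945 observations gives the 756 used for estimation.

```python
import numpy as np

def fit_ols(X, y):
    """Least squares coefficients; the first element is the intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict_ols(beta, X):
    return beta[0] + X @ beta[1:]

# Synthetic placeholder data: 945 offerings, 7 explanatory variables.
rng = np.random.default_rng(0)
n, k = 945, 7
X = rng.normal(size=(n, k))
y = 2.0 + X @ rng.normal(size=k) + rng.normal(scale=0.5, size=n)

n_train = n * 80 // 100                    # first 80% -> 756 observations
beta = fit_ols(X[:n_train], y[:n_train])   # estimate on the training sample
y_hat_test = predict_ols(beta, X[n_train:])  # predict the remaining 20%
```

Because the coefficients are obtained in one closed-form step, there is no iterative training during which a validation set could be monitored.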

Table 4 - Results of ordinary least squares
Ordinary least squares revealed that the most significant variables in the estimation of investors' demand for obligation issues were the rating by Moody's rating agency and the maturity of the given bond. The negative sign of the MOODYS explanatory variable indicates that investors preferred issues with a lower rating level. This can be explained by the fact that offerings with a lower rating are usually combined with a higher yield. The same can be stated about the MATURITY variable, since long-term bonds also offer higher yields. On the other hand, the coefficient of VOLUME suggests that the interest of investors decreased with issued volume.
Table 5 summarizes the outcomes obtained by the best neural network and ordinary least squares on in-sample and unseen data. The prediction for least squares was made on the latter group of data containing 20% of the sample. Subsequently, the estimates were compared to the actual values of investors' demand and both evaluation ratios were calculated. Our results suggest that the neural network significantly outperformed least squares in both categories and on both measures. The difference is particularly notable in the case of out-of-sample data, for both mean squared error and the determination coefficient.
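The two evaluation measures named above, mean squared error and the determination coefficient, can be computed as in this short sketch. The function names and the four-value example are illustrative assumptions, not figures from Table 5.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up demand multiples, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
mse = mean_squared_error(actual, predicted)   # 0.25
r2 = r_squared(actual, predicted)             # 0.8
```

A lower mean squared error and a higher determination coefficient on the unseen 20% of the data both indicate better out-of-sample prediction.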

CONCLUSION
In recent years, a lot of research has been dedicated to the quality of underwriting services in terms of costs, market performance, or offer price. On the issuer side, however, one of the most fundamental criteria of quality is the volume subscribed by investors. If the issue is under-subscribed, the underwriter most likely did not precisely adjust parameters such as the spread, volume, or maturity of the offering. On the other hand, if the issue is largely over-subscribed, the issuer will either have to pay high interest compared to its level of credit risk, or have to repay the funds earlier. This paper has therefore examined demand for bond offerings on primary markets using artificial neural networks. We estimated investor subscription of offered bonds with regard to issue characteristics such as total volume, coupon, maturity, credit rating and yield over corresponding mid-swaps. Moreover, we identified variables which have a crucial impact on total demand. Our results show that, on a sample of 945 obligation issues, the proposed neural network significantly outperformed ordinary least squares and achieved considerably better performance in terms of prediction accuracy and mean squared error. Our findings might help underwriters to precisely specify issue parameters in order to satisfy not only their issuing client, but also the investors. In addition, the issuing entity may be able to modify the issue to achieve a balance between its internal needs and the requirements of investors and so minimize the offering costs.
$t_i(n)$ is the target output of the neuron, $y_i(n)$ denotes the actual output and $n$ indicates the iteration step. The goal of the learning process is to reduce the difference between the target and actual output of the network by minimizing its cost function $E(n) = \frac{1}{2}\sum_i e_i^2(n)$, where $e_i(n) = t_i(n) - y_i(n)$.

Table 1 - Results of neural network with 10 neurons in hidden layer

Table 2 - Results of neural network with 15 neurons in hidden layer

Table 3 - Results of neural network with 20 neurons in hidden layer

Table 5 - Comparison of results