Evaluating the Fitting Performance of AGARCH(1,1), NAGARCH(1,1), and VGARCH(1,1) Models

—This study compares the performance of the GARCH(1,1), AGARCH(1,1),


I. INTRODUCTION
In financial markets, volatility is a phenomenon that can increase the risk and uncertainty attached to the value of an investment, which in turn makes investor interest in the market unstable. According to [1], the volatility of financial markets describes the fluctuations in the value of a market asset over a certain period of time. In statistics, volatility is defined as the standard deviation of the returns (changes in the logarithmic price) [2], [3]. It measures the dispersion of an asset's return movements for a given financial time series. Therefore, volatility has a direct bearing on risk management in global financial markets.
The existence of volatility in financial time series raises the problem of heteroscedasticity, which means that the volatility changes over different time periods. One popular class of models for time-varying volatility is GARCH (Generalized Autoregressive Conditional Heteroscedasticity), proposed by [4]. The GARCH model has a symmetric volatility response to returns, meaning that past positive returns (good news) and past negative returns (bad news) of the same magnitude have the same effect on the current volatility. However, the relationship between return and volatility is not always symmetric; it can also be asymmetric, meaning that positive and negative returns have different effects on volatility. Therefore, the asymmetric effect is essential in modeling and forecasting volatility.
Several studies have proposed extensions and modifications of the GARCH model to accommodate an asymmetric effect in volatility. This study focuses only on the three asymmetric GARCH models proposed by [5], namely the AGARCH (Asymmetric GARCH), NAGARCH (Nonlinear Asymmetric GARCH), and VGARCH (Vector GARCH) models. They applied these models to TOPIX (Tokyo Price Index) data, assuming a Normal distribution for the return errors, and showed that the proposed models fit the data better than the GARCH(1,1) model. The first contribution of this study is to extend the models of [5] by assuming that the return errors follow four different distributions: Normal, Skew-Normal (SN) of [6], Skew-Curved Normal (SCN) of [7], and Student-t. To the best of the authors' knowledge, no study has investigated the performance of these distributions in the models of [5].
To evaluate the performance of the models, this study fits them to the daily buying rate of USD (US Dollar) against IDR (Indonesian Rupiah) from January 2010 to December 2017. Fitting a model here means estimating its parameters. Therefore, the second contribution of this study is the use of the GRG (Generalized Reduced Gradient) Non-Linear method in Excel's Solver and the Adaptive Random Walk Metropolis (ARWM) method to estimate the studied models. Here, we employ the ARWM method within a Markov Chain Monte Carlo (MCMC) scheme and implement it in Scilab using our own code. The two methods are compared to evaluate their ability to estimate the considered GARCH(1,1)-type models. For other GARCH-type models, [8]-[9] showed that Excel's Solver's GRG Non-Linear method and the ARWM method estimate the parameters well.

A. GARCH(1,1) Models
Returns are often assumed to follow a normal distribution and are described in terms of their mean and standard deviation [10]. Let $R_t$ be an asset return at time $t$ that follows a normal distribution with mean 0 and variance $\sigma_t^2$. The equation for the return can be expressed as follows:
$$R_t = z_t, \quad \text{where } z_t \sim N(0, \sigma_t^2). \tag{1}$$
The study in [4] modeled the conditional variance in Eq. (1) as a GARCH(p, q) process in which p and q denote the lag lengths on returns and variances, respectively. The GARCH(1,1) model is perhaps the most popular GARCH-type model and is used in many empirical studies of financial time series. In this case, the current conditional variance is calculated from the weighted past squared return and the weighted past variance. Mathematically, the GARCH(1,1) process is defined as
$$\sigma_t^2 = \omega + \alpha R_{t-1}^2 + \beta \sigma_{t-1}^2, \tag{2}$$
in which the weighting factors satisfy $\omega > 0$ and $0 \le \alpha, \beta < 1$ to ensure the positivity of the variance, and $0 \le \alpha + \beta < 1$ for stationarity.
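As an illustration, the GARCH(1,1) recursion can be sketched in a few lines of Python. This is a minimal sketch, not the code used in this study: it assumes the standard recursion in which the current variance equals ω plus α times the previous squared return plus β times the previous variance. The function name `garch11_variance`, the choice of the unconditional variance as the starting value, and the sample returns are all assumptions made for the example.

```python
def garch11_variance(returns, omega, alpha, beta, sigma2_init=None):
    """Return conditional variances, one per return, for a GARCH(1,1) process."""
    # Parameter constraints for positivity and stationarity.
    if not (omega > 0 and 0 <= alpha < 1 and 0 <= beta < 1 and alpha + beta < 1):
        raise ValueError("parameters violate the GARCH(1,1) constraints")
    # Assumption: start the recursion at the unconditional variance.
    if sigma2_init is None:
        sigma2_init = omega / (1.0 - alpha - beta)
    sigma2 = [sigma2_init]
    for r_prev in returns[:-1]:
        # Weighted past squared return plus weighted past variance.
        sigma2.append(omega + alpha * r_prev ** 2 + beta * sigma2[-1])
    return sigma2


returns = [0.10, -0.25, 0.05, 0.40]  # hypothetical percentage returns
variances = garch11_variance(returns, omega=0.01, alpha=0.10, beta=0.85)
```

Here `variances[t]` is the conditional variance associated with `returns[t]`, so each variance depends only on information available before time t.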
To capture the asymmetric effect, [5] incorporated new information into the measure of volatility via the news impact curve, which gives the relation between $\sigma_t$ and $R_{t-1}$. Whereas the news impact curve of the GARCH model is symmetric and centered at $R_{t-1} = 0$, that of the asymmetric GARCH model of [5], AGARCH, is asymmetric about zero and centered at $R_{t-1} = -\gamma$. In particular, the variance process for the AGARCH(1,1) model is
$$\sigma_t^2 = \omega + \alpha (R_{t-1} + \gamma)^2 + \beta \sigma_{t-1}^2, \tag{3}$$
in which $\gamma \in \mathbb{R}$ represents an asymmetry parameter, and the conditions on the other parameters are as in the GARCH(1,1) model. If $\gamma \neq 0$, the effects of positive and negative returns are asymmetric; if $\gamma = 0$, the process reduces to the GARCH(1,1) process.
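The shift of the news impact curve can be checked numerically. The sketch below assumes the AGARCH(1,1) variance takes the form ω + α(R + γ)² + β σ², which is consistent with a curve centered at −γ. The function name `news_impact` and the parameter values are hypothetical, and the previous variance is held fixed at 1.

```python
def news_impact(r_prev, omega, alpha, beta, gamma=0.0, sigma2_prev=1.0):
    """Next-period variance as a function of the previous return (AGARCH form)."""
    return omega + alpha * (r_prev + gamma) ** 2 + beta * sigma2_prev


omega, alpha, beta, gamma = 0.01, 0.10, 0.85, 0.3  # hypothetical values
grid = [x / 10.0 for x in range(-20, 21)]          # previous returns from -2.0 to 2.0

garch_curve = [news_impact(r, omega, alpha, beta) for r in grid]
agarch_curve = [news_impact(r, omega, alpha, beta, gamma) for r in grid]

# Location of the minimum of each curve on the grid.
garch_min = grid[garch_curve.index(min(garch_curve))]
agarch_min = grid[agarch_curve.index(min(agarch_curve))]
```

With these values the GARCH curve has its minimum at a previous return of 0, while the AGARCH curve has its minimum at −γ = −0.3, so a positive return raises the variance more than a negative return of the same size.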
Furthermore, [5] modified the AGARCH(1,1) model into the NAGARCH(1,1) and VGARCH(1,1) models. The variance process for the NAGARCH(1,1) model assumes that
$$\sigma_t^2 = \omega + \alpha (R_{t-1} + \gamma \sigma_{t-1})^2 + \beta \sigma_{t-1}^2, \tag{4}$$
whereas the VGARCH(1,1) process is given by
$$\sigma_t^2 = \omega + \alpha \left( \frac{R_{t-1}}{\sigma_{t-1}} + \gamma \right)^2 + \beta \sigma_{t-1}^2. \tag{5}$$

B. Distributions for Return Error
1) Skew-Normal Distribution: Azzalini in [6] introduced the SN distribution to extend the Normal distribution by incorporating a skewness parameter $\lambda$. For a random variable $Z$, the general form of the SN probability density function with skewness $\lambda \in \mathbb{R}$ is given by
$$f(z \mid \lambda) = 2 \phi(z) \Phi(\lambda z),$$
where $\phi(\cdot)$ denotes the normal Probability Density Function (PDF) and $\Phi(\cdot)$ denotes the normal Cumulative Distribution Function (CDF). Therefore, the probability density function of the SN distribution for a random variable $Z$ with zero mean and variance $\sigma^2$ can be expressed as follows:
$$f(z \mid \sigma, \lambda) = \frac{2}{\sigma} \phi\left(\frac{z}{\sigma}\right) \Phi\left(\frac{\lambda z}{\sigma}\right).$$
When $\lambda < 0$, the distribution is skewed to the left; when $\lambda > 0$, the distribution is skewed to the right. Setting $\lambda = 0$ reduces the SN distribution to a Normal distribution.
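The properties of the SN density stated above can be verified with a short Python sketch. It assumes the general form 2φ(z)Φ(λz) and treats σ as a scale parameter applied to z; this scaling, the helper names, and the trapezoidal check are assumptions made for the example rather than details taken from this study.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sn_pdf(z, lam, sigma=1.0):
    """Skew-Normal density with skewness lam and (assumed) scale sigma."""
    u = z / sigma
    return 2.0 / sigma * norm_pdf(u) * norm_cdf(lam * u)


def integrate(f, a=-12.0, b=12.0, n=4000):
    """Composite trapezoidal rule, used only as a sanity check."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))


total_mass = integrate(lambda z: sn_pdf(z, lam=2.0, sigma=1.5))
```

The density should integrate to approximately one, coincide with the Normal density when λ = 0, and put more mass on positive values when λ > 0.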
2) Skew-Curved Normal Distribution: Arellano-Valle et al. in [7] introduced the SCN distribution as an asymmetric class of Normal distributions that differs from the SN distribution. The SCN probability density function for a random variable $Z$ with skewness $\lambda$ is given by
$$f(z \mid \lambda) = 2 \phi(z) \Phi\left(\frac{\lambda z}{\sqrt{1 + \lambda^2 z^2}}\right).$$
Therefore, the SCN probability density function with zero mean, variance $\sigma^2$, and skewness $\lambda$ can be expressed as follows:
$$f(z \mid \sigma, \lambda) = \frac{2}{\sigma} \phi\left(\frac{z}{\sigma}\right) \Phi\left(\frac{\lambda z}{\sqrt{\sigma^2 + \lambda^2 z^2}}\right).$$
When $\lambda < 0$, the distribution is left-skewed; when $\lambda > 0$, the distribution is right-skewed. Setting $\lambda = 0$ reduces the SCN distribution to a Normal distribution.
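A comparable sketch for the SCN density is given below. It assumes the form 2φ(z)Φ(λz / sqrt(1 + λ²z²)) with σ applied as a scale parameter to z. Both the scaling and the function names are assumptions made for illustration.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scn_pdf(z, lam, sigma=1.0):
    """Skew-Curved Normal density with skewness lam and (assumed) scale sigma."""
    u = z / sigma
    # The argument of the CDF is bounded in (-1, 1), unlike in the SN case.
    w = lam * u / math.sqrt(1.0 + (lam * u) ** 2)
    return 2.0 / sigma * norm_pdf(u) * norm_cdf(w)


right_tail = scn_pdf(1.0, lam=2.0)
left_tail = scn_pdf(-1.0, lam=2.0)
```

Because the skewing term is bounded, the SCN density never pushes the weight on either side all the way to zero, which is one practical difference from the SN density.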
3) Student-t Distribution: The Student-t distribution was introduced by William Sealy Gosset under the pseudonym "Student" (see [11]). The Student-t density curve is symmetric and bell-shaped like that of the Normal distribution but has thicker tails (often called heavy or fat tails). Following [12], for a random variable $Z$ with zero mean, variance $\sigma^2$, and degrees of freedom $\nu > 2$, the Student-t probability density function is given by
$$f(z \mid \sigma, \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi(\nu-2)\sigma^2}} \left(1 + \frac{z^2}{(\nu-2)\sigma^2}\right)^{-\frac{\nu+1}{2}},$$
where $\Gamma(\cdot)$ denotes the gamma function. The tail heaviness of the Student-t distribution is determined by the degrees of freedom $\nu$. Smaller degrees of freedom give heavier tails on both sides, and increasing the degrees of freedom brings the Student-t distribution closer to a Normal distribution, with $\nu \ge 30$ commonly taken as the point at which the two are close [13].
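The tail behaviour described above can be illustrated with a small Python sketch of the Student-t density standardised to variance σ², which is the usual parameterisation when ν > 2 is required. The function names and the test points are chosen for the example only.

```python
import math


def student_t_pdf(z, sigma2, nu):
    """Student-t density with zero mean, variance sigma2, and nu > 2 degrees of freedom."""
    if nu <= 2:
        raise ValueError("nu must exceed 2 for the variance to exist")
    scale = (nu - 2.0) * sigma2
    # Work on the log scale to avoid overflow in the gamma functions.
    log_c = (math.lgamma(0.5 * (nu + 1.0)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(math.pi * scale))
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(z * z / scale))


def normal_pdf(z, sigma2):
    """Normal density with zero mean and variance sigma2."""
    return math.exp(-0.5 * z * z / sigma2) / math.sqrt(2.0 * math.pi * sigma2)


tail_t = student_t_pdf(4.0, sigma2=1.0, nu=5.0)
tail_normal = normal_pdf(4.0, sigma2=1.0)
```

At four standard deviations the Student-t density with ν = 5 is far larger than the Normal density, while for very large ν the two densities become numerically close.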

III. ESTIMATION AND SELECTION
A. Parameter Estimation
One standard approach to estimating the parameters of GARCH-type models is Maximum Likelihood Estimation (MLE). MLE-based methods find the parameter values that maximize the likelihood function. With the same purpose, this study first uses Excel's Solver's GRG Non-Linear method to estimate the considered models. The GRG Non-Linear method is based on work published in [14], [15].
Excel's Solver is one of the add-ins available for Microsoft Excel and can be used to find an optimal value (maximum or minimum) in non-linear optimization problems. Compared with other tools that require programming knowledge, Excel's Solver is preferred by financial practitioners because numerical optimization can in many situations be done with it. Following the steps of [16], this study chooses the GRG Non-Linear method as the estimation method. According to [17], the existing values in the worksheet cells for each decision variable are taken as an initial solution, and the method then searches for small changes in these values that improve the objective value. In this way, the objective value increases if the objective is maximization, or decreases if the objective is minimization, until an optimal solution is reached.
Second, we employ the ARWM method introduced by [18] and compare its results with those of Excel's Solver. The studies in [8]-[9] successfully applied the method in a Bayesian MCMC scheme. The ARWM method was developed to improve the efficiency of the random walk Metropolis algorithm, one of the simplest samplers commonly used in practice.
In the Bayesian framework, one makes probability statements about a parameter. In Bayesian terminology, the probability distribution of a parameter after observing the data is called the "posterior", and it is often stated as
$$f(\theta \mid \text{data}) \propto f(\text{data} \mid \theta) \times f(\theta),$$
where the symbol "∝" means "proportional to". For a parameter $\theta$, the posterior distribution is denoted by $f(\theta \mid \text{data})$, the likelihood function by $f(\text{data} \mid \theta)$, and the prior distribution by $f(\theta)$. The ARWM method updates the value of $\theta$ in each MCMC iteration. Given a value $\theta_i$ and a step size $s_i$ at the $i$-th iteration, the next iteration of MCMC is completed as follows.
(iv) Calculate the step size $s^{*}$, where $m(\theta^{*})$ is the frequency of acceptance of the proposal $\theta^{*}$ and 0.44 is the expected acceptance probability. If $s^{*} > s_{\max}$, then $s_{i+1} = s_{\max}$; if $s^{*} < s_{\max}$, then $s_{i+1} = s_i$.
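To make the idea of an adaptive random walk Metropolis sampler concrete, the following Python sketch combines a standard random walk Metropolis update with a simple step-size adaptation that targets an acceptance rate of 0.44. It is an illustration only and not the algorithm of [18] or the Scilab code of this study: the proposal, the acceptance rule, the adaptation formula, the clipping to `s_min` and `s_max`, and every name in the code are assumptions.

```python
import math
import random


def arwm(log_post, theta0, n_iter=6000, s0=1.0, s_min=1e-3, s_max=10.0,
         target=0.44, seed=0):
    """Random walk Metropolis with an assumed step-size adaptation rule."""
    rng = random.Random(seed)
    theta, step = theta0, s0
    lp = log_post(theta)
    accepted = 0
    chain = []
    for i in range(1, n_iter + 1):
        # Propose a move from a normal random walk around the current value.
        proposal = theta + step * rng.gauss(0.0, 1.0)
        lp_prop = log_post(proposal)
        # Metropolis acceptance on the log scale; 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
            accepted += 1
        # Assumed adaptation: enlarge the step when the running acceptance
        # frequency is above the target, shrink it when below.
        s_star = step + (accepted / i - target) / i ** 0.6
        step = min(max(s_star, s_min), s_max)
        chain.append(theta)
    return chain, accepted / n_iter, step


# Toy target: a standard normal log-posterior (up to an additive constant).
chain, accept_rate, final_step = arwm(lambda th: -0.5 * th * th, theta0=3.0)
posterior_mean = sum(chain[1000:]) / len(chain[1000:])
```

In this toy run the first 1000 draws are discarded as burn-in and the mean of the remaining draws estimates the posterior mean, which mirrors the way posterior summaries are usually formed from an MCMC chain.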

B. Model Selection
To perform statistical model selection and comparison, several standard statistical tests and information criteria can be applied. In general, information criteria such as the AIC (Akaike Information Criterion) can be used to select among competing models, including non-nested models (situations in which no model is a particular case of another), and to determine the best-fitting model [19]. The best of several candidate models for a given dataset is determined by the AIC score [20]:
$$\mathrm{AIC} = 2K - 2\log(L),$$
where $K$ is the number of estimated parameters and $L$ is the maximum value of the likelihood function. The model with the lowest AIC score is the best.
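The AIC comparison can be sketched as follows, assuming the usual definition of the AIC as twice the number of parameters minus twice the maximised log-likelihood. The model labels, log-likelihood values, and parameter counts are hypothetical and are not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion from a maximised log-likelihood."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs for three models.
candidates = {
    "model_a": (-1250.4, 3),
    "model_b": (-1247.9, 4),
    "model_c": (-1245.1, 4),
}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
ranking = sorted(scores, key=scores.get)  # best (lowest AIC) first
```

An extra parameter is worth including only if it raises the log-likelihood by more than one unit, since each parameter adds 2 to the score while each unit of log-likelihood subtracts 2.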

IV. EMPIRICAL APPLICATION
A. Data Description
This study uses the daily returns of the USD to IDR exchange rate from January 2010 to December 2017 (1891 observations). The data are selected to provide evidence that the AGARCH(1,1), NAGARCH(1,1), and VGARCH(1,1) models are more suitable than the GARCH(1,1) model. The continuously compounded return over the period from $t-1$ to $t$ is calculated in percent as follows:
$$R_t = 100 \times (\log P_t - \log P_{t-1}),$$
where $P_t$ denotes the asset price at time $t$.
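The return calculation is straightforward to reproduce. The sketch below computes percentage log returns from a price series; the function name and the four prices are hypothetical and are not taken from the USD/IDR data.

```python
import math


def log_returns_percent(prices):
    """Percentage log returns: 100 * (log P_t - log P_{t-1})."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]


prices = [9400.0, 9450.0, 9425.0, 9425.0]  # hypothetical exchange rates
returns = log_returns_percent(prices)
```

A series of n prices yields n − 1 returns, and the returns add up to the percentage log change between the first and last price.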
Figure 1 displays the daily return series of the USD/IDR exchange rate. The figure shows that the return series is stationary, meaning that it fluctuates around its average. The Augmented Dickey-Fuller test (see [21]) produces a statistic of −44.04 (smaller than the critical value of −1.94) with a p-value of 0.001 (smaller than 5%), which confirms that the data contain no unit root and are therefore stationary. Hence, the USD/IDR data satisfy the model's underlying assumptions. To examine whether the USD/IDR return series exhibits conditional heteroscedasticity (the ARCH effect), we use Engle's Lagrange multiplier test. The ARCH test confirms that the returns are heteroscedastic: the test statistic is greater than the critical value of 3.84, with a p-value of 0. Therefore, the volatility analysis needs to be carried out using an ARCH/GARCH-type model.
Table I gives descriptive statistics for the daily returns of the USD/IDR exchange rate. At the 5% significance level, the Jarque-Bera (JB) normality test rejects the Normal distribution for the observed data. The rejection is indicated by a JB statistic (see [22]) greater than the critical value of 5.99, taken from the chi-square distribution with 2 degrees of freedom. The departure from normality can also be seen from the kurtosis value greater than 3, which indicates heavy tails, and from the skewness value not close to zero, which indicates that the distribution is not symmetric. Therefore, the assumption of non-normal distributions is appropriate in our case. In particular, the error process is allowed to follow four distributions: Normal, Skew-Normal, Skew-Curved Normal, and Student-t.
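For reference, the skewness, kurtosis, and JB statistic can be computed with a few lines of Python. The sketch assumes the textbook form of the statistic, n/6 times the sum of the squared skewness and one quarter of the squared excess kurtosis, with moments computed using divisor n. The function name and the sample are hypothetical.

```python
def jarque_bera(x):
    """Return (skewness, kurtosis, JB statistic) using moment estimators with divisor n."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb


sample = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical data
skew, kurt, jb = jarque_bera(sample)
reject_normality = jb > 5.99          # 5% critical value, chi-square with 2 d.f.
```

For this small symmetric sample the skewness is zero and the JB statistic is well below 5.99, so normality would not be rejected; heavy-tailed return data typically produce a much larger value.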

B. Development of Log-likelihood
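Suppose a return series is expressed as the sequence $R = \{R_1, R_2, \ldots, R_T\}$. For mathematical convenience, the likelihood is expressed in logarithmic form,
$$\log L(\theta \mid R) = \sum_{t=1}^{T} \log L(\theta \mid R_t),$$
where $\theta$ is the vector of estimated model parameters and the process of $\sigma_t^2$ follows a considered GARCH-type model.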
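The following Python sketch shows how a log-likelihood of this kind can be accumulated over time for Normal errors. It is a sketch under stated assumptions: the variance updates for the NAGARCH and VGARCH cases use commonly cited forms, in which the asymmetry term is γσ added to the return and γ added to the standardised return respectively, and the recursion starts from the sample variance. The function names and the data are hypothetical.

```python
import math


def variance_step(model, r_prev, s2_prev, omega, alpha, beta, gamma=0.0):
    """One step of the conditional variance recursion for the chosen model."""
    if model == "GARCH":
        shock = r_prev
    elif model == "AGARCH":
        shock = r_prev + gamma
    elif model == "NAGARCH":          # assumed form: gamma scaled by volatility
        shock = r_prev + gamma * math.sqrt(s2_prev)
    elif model == "VGARCH":           # assumed form: standardised return
        shock = r_prev / math.sqrt(s2_prev) + gamma
    else:
        raise ValueError("unknown model: " + model)
    return omega + alpha * shock ** 2 + beta * s2_prev


def normal_loglik(returns, model, omega, alpha, beta, gamma=0.0, s2_init=None):
    """Sum of per-observation Normal log-likelihoods under a GARCH-type variance."""
    if s2_init is None:
        # Assumption: start the recursion at the sample variance.
        mean = sum(returns) / len(returns)
        s2_init = sum((r - mean) ** 2 for r in returns) / len(returns)
    s2, total = s2_init, 0.0
    for t, r in enumerate(returns):
        if t > 0:
            s2 = variance_step(model, returns[t - 1], s2, omega, alpha, beta, gamma)
        total += -0.5 * (math.log(2.0 * math.pi * s2) + r * r / s2)
    return total


returns = [0.12, -0.30, 0.05, 0.41, -0.22]  # hypothetical percentage returns
ll_garch = normal_loglik(returns, "GARCH", 0.05, 0.10, 0.80)
ll_nagarch = normal_loglik(returns, "NAGARCH", 0.05, 0.10, 0.80, gamma=0.3)
```

An optimiser such as a spreadsheet solver would vary the parameters to maximise this total, while a sampler would add the log-prior to it to obtain the log-posterior.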

C. Estimation Details
The GRG Non-Linear method in Excel's Solver is applied by following steps similar to those of [16]. Firstly, initial values are set for all unknown parameters. Then, in the Excel spreadsheet, the variance $\sigma_t^2$ and the log-likelihood $\log(L(\theta \mid R_t))$ are calculated for each time point of the return series based on Eqs. (14)-(16), according to the considered model. Note that Excel's Solver does not support strict inequality constraints (">" and "<"), which implies that the estimation results may not satisfy the model constraints. For example, a case of $\alpha + \beta = 1$ appears in [9], whereas the constraint requires $\alpha + \beta < 1$. All options of the GRG Non-Linear method are left at their default settings.
This study compares the results of Excel's Solver's GRG Non-Linear method with those of the ARWM method. We implement the ARWM method within an MCMC algorithm using our own code in Scilab. The MCMC simulation generates a Markov chain of 6000 iterations for each parameter. The first 1000 samples are discarded to remove the non-stationary part of the chain caused by the arbitrary initial parameter values. The remaining 5000 samples are recorded and used to calculate the posterior means and the 95% HPD (Highest Posterior Density) intervals; see [23] for how to calculate this interval. The prior on the parameters $(\omega, \alpha, \beta)$ is a left-truncated Normal distribution N(0, 1000) as in [24], the prior on $\nu$ is an exp(0.01) distribution as in [25], and the prior on $(\gamma, \lambda)$ is a Normal distribution N(0, 1000).
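The posterior summaries described above can be sketched in Python as follows. The HPD interval is estimated here as the shortest interval that contains 95% of the sorted draws, a common sample-based estimator; whether it matches the procedure of [23] in every detail is an assumption. The draws are independent standard normal numbers that stand in for a real chain, and all names are hypothetical.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing the requested share of the sorted draws."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, int(math.ceil(mass * n)))  # number of draws inside the interval
    # Slide a window of k consecutive order statistics and keep the narrowest.
    widths = [(xs[i + k - 1] - xs[i], i) for i in range(n - k + 1)]
    _, start = min(widths)
    return xs[start], xs[start + k - 1]


rng = random.Random(1)
draws = [rng.gauss(0.0, 1.0) for _ in range(6000)]  # stand-in for an MCMC chain
kept = draws[1000:]                                  # discard the burn-in period
post_mean = sum(kept) / len(kept)
lower, upper = hpd_interval(kept, mass=0.95)
```

A parameter is then judged significant at the 5% level when the interval from `lower` to `upper` excludes zero; for the stand-in draws used here the interval contains zero, as expected.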

D. Estimation Results
We first assume that if the bias, that is, the relative difference between the estimates from the two methods, is very close to zero, then the two estimation methods give similar results [26]. The estimation results from Excel's Solver and MCMC when the models are fitted to the USD/IDR exchange rate time series are presented in Table II. The results show that Excel's Solver produces $\alpha + \beta = 1$ (an Integrated GARCH model) in the Student-t case, except for the VGARCH(1,1) model. However, this does not appear to affect the estimates of the other parameters significantly, because their values are similar to those obtained with the MCMC method. Notice that the violation occurs when $\alpha + \beta$ is very close to 1. Overall, both estimation methods give very similar results. Therefore, Excel's Solver has the potential to be used by financial practitioners without a strong programming background.
A disadvantage of Excel's Solver's GRG Non-Linear method is that it provides no statistical significance criteria for the estimates, since the method works on the gradient (slope) of the objective function. Therefore, the statistical significance of key parameters such as the asymmetry and skewness parameters is assessed using the 95% HPD intervals obtained with the MCMC method. Table III reports the HPD intervals at the 5% significance level for the asymmetry parameter $\gamma$. The results show that the 95% HPD interval of $\gamma$ excludes 0, meaning that the estimated parameter is significant, except for the AGARCH(1,1) and VGARCH(1,1) models under the Student-t specification. These findings indicate that the observed data support incorporating the asymmetry effect of [5].
The results mainly show that $\gamma > 0$ in all cases, except for the VGARCH(1,1) model with the Student-t distribution. Based on the asymmetric variance processes in Eqs. (3)-(5), a positive value of $\gamma$ implies that past positive returns result in a larger increase in the current variance than negative returns of the same absolute magnitude. For the skewness parameter $\lambda$, the 95% HPD intervals are reported in Table IV. The intervals indicate statistical significance at the 5% level for $\lambda$ in both the SN and SCN distributions in each asymmetric model, since the intervals exclude 0. This result provides evidence that both skewness specifications should be considered in the distribution of the returns.

E. Model Evaluation
An essential task of modeling is model evaluation. This section evaluates the competing GARCH models in terms of their in-sample performance, measured using AIC. Table V presents the AIC values and ranks each distribution and model accordingly. We first note that Excel's Solver and MCMC give similar results. The results indicate that the NAGARCH(1,1) model provides the best fit, followed by the AGARCH(1,1), GARCH(1,1), and VGARCH(1,1) models. The only exception is the Student-t case, in which the GARCH(1,1) model outperforms the AGARCH(1,1) model. This result confirms the previous finding that the asymmetry parameter in both the AGARCH(1,1) and VGARCH(1,1) models is not statistically significant under the Student-t specification. Moreover, AIC selects Student-t as the best-fitting distribution for the USD/IDR data, followed by the SCN and SN distributions. This result confirms the previous finding that the skewness parameter in both the SN and SCN distributions is statistically significant. Comparing all results, we conclude that the NAGARCH(1,1) model under the Student-t distribution best reflects the characteristics of the USD/IDR exchange rate time series.

V. CONCLUDING REMARKS
This study focuses on the in-sample performance of the asymmetric GARCH(1,1) models of [5], namely AGARCH, NAGARCH, and VGARCH, in terms of their ability to fit the volatility of USD/IDR exchange rate returns over the period from January 2010 to December 2017. The fitting performance is investigated under four distributional assumptions for the return errors: Normal, Skew-Normal, Skew-Curved Normal, and Student-t. The GRG Non-Linear method in Excel's Solver and the ARWM method within MCMC, implemented in Scilab, are employed to estimate the considered models. Even though Excel's Solver violates a constraint in the Student-t case, its GRG Non-Linear method estimates the asymmetric GARCH models well, as indicated by estimates similar to those from the ARWM method. The AIC values suggest that the NAGARCH(1,1) model under the Student-t distribution performs best in capturing the USD/IDR volatility. The analysis confirms the result in [5], which showed evidence of the superiority of the NAGARCH model. The result also supports the evidence in [27] that the Student-t distribution captures heavy tails better than the Skew-Normal distribution, and even better than the Skew-Curved Normal distribution.

TABLE I SKEWNESS, KURTOSIS, AND JB STATS FOR RETURNS OF USD/IDR.

TABLE II THE ESTIMATION RESULTS USING EXCEL'S SOLVER AND THE POSTERIOR MEANS USING MCMC FOR THE USD/IDR DATA.

TABLE III HPD INTERVALS AT THE 5% SIGNIFICANCE LEVEL FOR γ.

TABLE IV HPD INTERVALS AT THE 5% SIGNIFICANCE LEVEL FOR λ.

TABLE V AIC VALUES OF COMPETING MODELS.