Quantile Regression Neural Network Model For Forecasting Consumer Price Index In Indonesia

The main purpose of time series analysis is to forecast future values from past observations. Quantile Regression Neural Network (QRNN) is a statistical method that models data with non-homogeneous variance using an artificial neural network approach, which can capture nonlinear patterns in the data. The Consumer Price Index (CPI) is an example of real data presumed to have these characteristics. CPI forecasting is important for assessing price changes associated with the cost of living and for identifying periods of inflation or deflation. The purpose of this research is to compare several methods of forecasting the CPI in Indonesia. The data used in this study cover the period from January 2007 to April 2018. The QRNN method is compared with the Neural Network (NN) method using RMSE as the evaluation criterion. The result is that QRNN is the best method for forecasting the CPI, with an RMSE of 0.95.

Keywords: Consumer Price Index, Neural Network, Nonlinearity, Quantile Regression


I. INTRODUCTION
One of the statistical methods used for data analysis is time series analysis. The main purpose of time series analysis is to forecast future values from past observations. A common modern forecasting method is the Neural Network (NN). The advantages of the NN method are that it can be used for nonlinear data and that it is a universal approximator with high accuracy that requires no assumptions [1].
In the financial sector, it is common for the data to be forecast to have non-homogeneous variance. One method for forecasting data with non-homogeneous variance is Quantile Regression (QR). Unlike the Ordinary Least Squares (OLS) method, QR does not require assumptions on the residuals, such as being identically distributed, independent, normally distributed, and free of heteroscedasticity [2]. Forecasting with QR produces interval forecasts.
In this research, QR is applied to the Consumer Price Index (CPI) in Indonesia because the CPI is presumed to exhibit heteroscedasticity. The CPI is a measure that examines the weighted average of prices of a basket of consumer goods and services, such as transportation, food, and medical care. CPI forecasting is important for assessing price changes associated with the cost of living and for identifying periods of inflation or deflation [3]. Previous studies on forecasting the CPI with ARIMA models have been carried out for Bandar Lampung [4] and Rwanda [5].
The accuracy of CPI forecasts is the most important consideration for making informed decisions about the economy. To increase accuracy, this research uses a hybrid method. In general, a combination of two methods performs better than a single method [6]. The QR method is combined with the Neural Network, a combination known as the hybrid Quantile Regression Neural Network (QRNN). This method has been used to estimate the conditional density of multi-period returns [7]. There, QRNN performed better than GARCH when using the empirical distribution and tended to perform the same as GARCH when using the Gaussian distribution. Forecasting with a combination of Quantile Regression and Neural Network, referred to in that work as the Robust Neural Network (RNN), on credit portfolio data showed that the method is more resistant to outliers than linear regression or spline regression [8].

A. Terasvirta Nonlinearity Test
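The Terasvirta nonlinearity test is conducted to determine whether the data follow a linear or a nonlinear pattern. The test applies an F test to the additional terms that result from a Taylor series expansion [9]. The hypotheses for the Terasvirta nonlinearity test are as follows.
$H_0$: the model is linear
$H_1$: the model is nonlinear
The F test statistic is calculated with the following formula:
$$F = \frac{(SSR_0 - SSR_1)/m}{SSR_1/(n - p - 1 - m)}$$
where $SSR_0$ is the residual sum of squares of the linear model, $SSR_1$ is the residual sum of squares of the model augmented with the $m$ additional terms, $n$ is the number of observations, and $p$ is the order of the model. $H_0$ is rejected if $F > F_{(\alpha;\, m,\, n-p-1-m)}$, which means that the model is not linear.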

B. Quantile Regression
Quantile Regression (QR) was first introduced by Koenker and Bassett (1978). This approach models the various quantile functions of the distribution of Y as functions of X. Quantile Regression is useful when the distribution of the data is not homogeneous (heterogeneous) and does not have a standard shape, for example when it is asymmetric, has a long tail, or is truncated. The pinball loss function of quantile regression is as follows [10].
$$\rho_\tau(u) = \begin{cases} \tau u, & u \ge 0 \\ (\tau - 1)\,u, & u < 0 \end{cases}$$
with $0 < \tau < 1$. Given predictors $x_i(t)$, $i = 1, \ldots, I$, slope coefficients $m_i$, and intercept $b$, the linear regression equation for $\hat{y}_\tau(t)$ conditional on the $\tau$-quantile is as follows.
$$\hat{y}_\tau(t) = \sum_{i=1}^{I} m_i\, x_i(t) + b$$
The coefficients of the equation can be estimated by minimizing the quantile regression error function as follows.
$$E_\tau = \frac{1}{n} \sum_{t=1}^{n} \rho_\tau\big(y(t) - \hat{y}_\tau(t)\big)$$
with $y(t)$ being the response observed at time $t = 1, \ldots, n$.
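The role of the pinball loss can be illustrated with a short Python sketch. It is a minimal illustration rather than the implementation used in this study: the function names and the sample values are hypothetical, and the error is minimized over constant forecasts only, which is enough to show that the minimizer is the sample quantile.

```python
def pinball_loss(u, tau):
    """Pinball loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    return tau * u if u >= 0 else (tau - 1.0) * u


def quantile_error(y, y_hat, tau):
    """Mean pinball loss of the forecasts y_hat against the observations y."""
    return sum(pinball_loss(a - b, tau) for a, b in zip(y, y_hat)) / len(y)


def best_constant(y, tau):
    """Observed value that minimizes the error when used as a constant forecast."""
    return min(y, key=lambda c: quantile_error(y, [c] * len(y), tau))


# Hypothetical data: eight typical values and one large outlier.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
median_fit = best_constant(y, 0.5)    # constant forecast for tau = 0.5
lower_fit = best_constant(y, 0.25)    # constant forecast for tau = 0.25
```

In this example the outlier does not pull the fitted median away from the centre of the data, which is one reason the loss is attractive for heteroscedastic or heavy-tailed series.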

C. Neural Network
Neural Network is a machine learning technique developed as a generalization of mathematical models of the biological nervous system. Learning methods in neural networks can be classified into three types: supervised learning, unsupervised learning, and reinforcement learning. The neural network model commonly used in forecasting is the feed-forward neural network [1]. An example architecture is the feed-forward neural network with three layers. The relationship between the output $Y_t$ and the inputs $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}$ is described by the following equation:
$$Y_t = w_0 + \sum_{j=1}^{q} w_j\, g\Big(w_{0j} + \sum_{i=1}^{p} w_{ij}\, Y_{t-i}\Big) + \varepsilon_t \tag{7}$$
where $w_{ij}$ ($i = 0, 1, \ldots, p$; $j = 1, \ldots, q$) and $w_j$ ($j = 0, 1, \ldots, q$) are the model parameters, often referred to as weights, $g$ is the activation function of the hidden layer, $p$ is the number of input nodes, and $q$ is the number of hidden nodes. The function usually used in the hidden layer is the logistic function or the hyperbolic tangent. Equation (7) establishes a nonlinear mapping from the observed past values $(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p})$ to the future value $Y_t$:
$$Y_t = f(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}, w) + \varepsilon_t$$
where $w$ is the vector of all parameters and $f$ is the function determined by the network structure and the weights. Thus, this neural network is equivalent to a nonlinear autoregressive model [13].
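The mapping from past values to a forecast can be sketched as a forward pass in plain Python. This is a minimal sketch under stated assumptions: a single hidden layer with a logistic activation and a linear output node; the function names and all weight values are illustrative and are not the estimates reported in this study.

```python
import math


def logistic(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def nar_forecast(past, hidden_weights, hidden_bias, out_weights, out_bias):
    """One-step forecast from a feed-forward network with one hidden layer.

    past           : the p lagged values [Y_{t-1}, ..., Y_{t-p}]
    hidden_weights : q rows, each holding the p input-to-hidden weights
    hidden_bias    : q hidden-node biases
    out_weights    : q hidden-to-output weights
    out_bias       : bias of the output node
    """
    hidden = [
        logistic(b + sum(w * y for w, y in zip(row, past)))
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    # Linear output node: bias plus the weighted sum of hidden-node outputs.
    return out_bias + sum(w * h for w, h in zip(out_weights, hidden))


# Illustrative (not estimated) weights for p = 2 inputs and q = 2 hidden nodes.
forecast = nar_forecast(
    past=[0.0, 0.0],
    hidden_weights=[[0.5, -0.3], [0.1, 0.8]],
    hidden_bias=[0.0, 0.0],
    out_weights=[1.0, 2.0],
    out_bias=0.25,
)
```

Because the forecast depends on the lagged values only through a nonlinear function of the weights, the network behaves as a nonlinear autoregressive model of order p.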

D. Quantile Regression Neural Network
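Quantile Regression Neural Network (QRNN) is a combination of the Quantile Regression and Neural Network methods. Given predictors $x_i(t)$ and response $y(t)$, the output of hidden-layer node $j$ is as follows [10].
$$g_j(t) = \tanh\Big(\sum_{i=1}^{I} x_i(t)\, w_{ij}^{(h)} + b_j^{(h)}\Big)$$
where $w_{ij}^{(h)}$ and $b_j^{(h)}$ are the weights and biases of the hidden layer. The estimate of the conditional $\tau$-quantile is obtained from the output layer,
$$\hat{y}_\tau(t) = f\Big(\sum_{j=1}^{J} g_j(t)\, w_j^{(o)} + b^{(o)}\Big)$$
where $w_j^{(o)}$ and $b^{(o)}$ are the weights and bias of the output layer and $f$ is the output-layer transfer function. Because the pinball loss is not differentiable at zero, it is replaced by the approximation
$$\rho_\tau^{(a)}(u) = \begin{cases} \tau\, h(u), & u \ge 0 \\ (1 - \tau)\, h(u), & u < 0 \end{cases}$$
where $u$ is the residual and $h(u)$ is the Huber norm, given by the following equation.
$$h(u) = \begin{cases} \dfrac{u^2}{2\varepsilon}, & 0 \le |u| \le \varepsilon \\ |u| - \dfrac{\varepsilon}{2}, & |u| > \varepsilon \end{cases}$$
Here $\varepsilon$ is a threshold whose value is specified in advance. The error function to be optimized is given in the following equation.
$$E_\tau^{(a)} = \frac{1}{n} \sum_{t=1}^{n} \rho_\tau^{(a)}\big(y(t) - \hat{y}_\tau(t)\big)$$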
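The effect of the Huber norm on the pinball loss can be checked numerically. The sketch below is an illustration under assumptions, not the code used in this study: the function names are hypothetical, and the negative branch is weighted by (1 - tau) so that the smoothed loss stays non-negative.

```python
def huber(u, eps):
    """Huber norm: quadratic within eps of zero, |u| - eps / 2 outside."""
    if abs(u) <= eps:
        return u * u / (2.0 * eps)
    return abs(u) - eps / 2.0


def pinball(u, tau):
    """Exact pinball loss, which has a kink at u = 0."""
    return tau * u if u >= 0 else (tau - 1.0) * u


def smooth_pinball(u, tau, eps):
    """Pinball loss with |u| replaced by the Huber norm, smooth at u = 0."""
    weight = tau if u >= 0 else 1.0 - tau
    return weight * huber(u, eps)


# Far from zero the two losses differ only by the constant weight * eps / 2.
gap = pinball(2.0, 0.5) - smooth_pinball(2.0, 0.5, 0.5)
```

As the threshold shrinks toward zero the gap vanishes, so the smoothed error function approaches the exact quantile regression error while remaining usable with gradient-based optimizers.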

E. Consumer Price Index (CPI)
The Consumer Price Index (CPI) is a measure that examines the weighted average of prices of a basket of consumer goods and services, such as transportation, food, and medical care. It is calculated by taking the price changes for each item in the predetermined basket of goods and averaging them. Changes in the CPI are used to assess price changes associated with the cost of living. The CPI can also be used as a deflator for other economic factors, including retail sales, hourly/weekly earnings, and the value of a consumer's dollar, to find its purchasing power [3].
The Bureau of Labor Statistics (BLS) records about 80,000 items each month by calling or visiting retail stores, service establishments (such as cable providers, airlines, and car and truck rental agencies), rental units, and doctors' offices across the country in order to get the best outlook for the CPI. The formula used to calculate the Consumer Price Index for a single item is as follows.

$$\mathrm{CPI} = \frac{\text{Cost of Market Basket in Given Year}}{\text{Cost of Market Basket in Base Year}} \times 100$$

The base year is determined by the BLS. CPI data for the years 2016 and 2017 were based on surveys collected in 2013 and 2014 [11].
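As a worked example of the index formula, the following Python sketch computes the CPI from two basket costs. The function name and the basket costs are hypothetical and serve only to show the arithmetic.

```python
def consumer_price_index(cost_given_year, cost_base_year):
    """CPI = cost of the basket in the given year / cost in the base year * 100."""
    if cost_base_year <= 0:
        raise ValueError("the base-year basket cost must be positive")
    return 100.0 * cost_given_year / cost_base_year


# Hypothetical basket costs in the same currency unit.
cpi = consumer_price_index(1150.0, 1000.0)
change_since_base = cpi - 100.0  # percentage price change relative to the base year
```

A value above 100 indicates that the basket costs more than in the base year, and a value below 100 indicates that it costs less.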

A. Data Description
Before the data are analyzed, the time series plot of the CPI data is examined. From Fig. 2, it can be seen that the CPI data are not stationary in the mean. To make the CPI data stationary in the mean, first-order differencing must be applied. The stationary data are then analyzed using NN and QRNN. For the NN and QRNN methods to be appropriate, the data should be neither normally distributed nor linear. For the CPI data, normality is tested using the Kolmogorov-Smirnov test and linearity is tested using the Terasvirta test.
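First-order differencing replaces each observation by its change from the previous one. A minimal Python sketch, with a hypothetical function name and hypothetical index values, is:

```python
def difference(series, order=1):
    """Apply differencing `order` times; each pass shortens the series by one."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


# Hypothetical index levels with an upward trend.
levels = [100.0, 101.0, 103.0, 106.0]
first_diff = difference(levels)            # month-to-month changes
second_diff = difference(levels, order=2)  # changes of the changes
```

The differenced series fluctuates around a roughly constant level when the trend in the original series has been removed.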

B. Modeling Using Neural Network
In the NN method, the input variables are the significant PACF lags of the stationary series, while the output variable is the CPI data. From Fig. 4, it can be seen that the significant lags are lags 1, 2, and 5, so these lags become the input variables. With lags 1, 2, and 5 as input variables, the CPI data as the output variable, a normalization transformation, and the sigmoid activation function, the following results are obtained. The parameter estimates from the NN are given in Table II.
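The construction of the network inputs from selected lags can be sketched as follows. This is an illustration under assumptions: the normalization is taken to be min-max scaling to the interval [0, 1], which the text does not specify, and for simplicity the inputs and the target are drawn from the same series. The function names and the sample series are hypothetical.

```python
def min_max_scale(values):
    """Scale values linearly to [0, 1] (an assumed form of normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def lagged_design(series, lags):
    """Build input rows [y_{t-k} for k in lags] and the matching targets y_t."""
    max_lag = max(lags)
    rows = [[series[t - k] for k in lags] for t in range(max_lag, len(series))]
    targets = list(series[max_lag:])
    return rows, targets


# Hypothetical stationary series; lags 1, 2 and 5 as in the PACF result.
series = [float(i) for i in range(8)]
rows, targets = lagged_design(series, (1, 2, 5))
scaled = min_max_scale([2.0, 4.0, 6.0])
```

The first five observations are used only as lagged inputs, so the number of training rows equals the series length minus the largest lag.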

C. Modeling Using Quantile Regression Neural Network
Modeling using the Quantile Regression Neural Network is carried out with two kinds of model, as follows.
1. One-step model using the significant PACF lags of the stationary data as input variables and the CPI as the output variable.
In the QRNN method, as in the NN method, the input variables are the significant PACF lags of the stationary series, while the output variable is the CPI data. As in the NN method, the CPI data must first be normalized. The RMSE values for several quantiles are given in Table III.
Based on the PACF of the NN3 residuals in Fig. 6, lags 1 and 2 are significant and are used as input variables. As in the NN method, the CPI data must first be normalized. The RMSE values for several quantiles are given in Table IV.

D. Comparing the Performance of Neural Network and Quantile Regression Neural Network
Based on the RMSE and the time series plot of each method, the best method is QRNN. For the CPI data with a normalization transformation, the best method is QRNN, with an RMSE of 0.95, as shown in Table V and visualized in Fig. 7.
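The RMSE criterion used for this comparison is the square root of the mean squared forecast error. A minimal Python sketch, with a hypothetical function name and hypothetical values rather than the CPI results, is:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observations and forecasts."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observations and forecasts that are each off by two units.
error = rmse([10.0, 12.0, 14.0, 16.0], [12.0, 10.0, 16.0, 14.0])
```

The method with the smaller RMSE on the same evaluation data is preferred.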

CONCLUSION
Based on the application of the NN and QRNN methods to the CPI data with a normalization transformation, it can be concluded from a comparison of the RMSE values that QRNN is the best method. The RMSE for QRNN is 0.95.

SUGGESTION
For further research with the ARIMAX model, outlier detection should be carried out so that the assumption of normally distributed residuals can be fulfilled. In addition, a strategy is needed to prevent the QRNN forecast intervals from crossing.
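One simple strategy for non-crossing intervals, offered here as an assumption and not evaluated in this study, is to sort the quantile forecasts at each time point so that they increase with the quantile level. The function name and the forecast values in this Python sketch are hypothetical.

```python
def rearrange_quantiles(forecasts):
    """Reassign forecasts so that they are non-decreasing in the quantile level.

    forecasts : dict mapping each quantile level tau to its forecast
                for a single time point
    """
    taus = sorted(forecasts)
    values = sorted(forecasts[tau] for tau in taus)
    return dict(zip(taus, values))


# Hypothetical forecasts in which the 0.05 quantile lies above the median.
crossed = {0.05: 1.2, 0.5: 1.0, 0.95: 1.5}
fixed = rearrange_quantiles(crossed)
```

Sorting leaves forecasts that are already ordered unchanged, so it can be applied to every time point without checking for crossings first.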