Generating Hourly Rainfall Model using Bayesian Time Series Model (A Case Study at Sentral Station, Bondowoso)
Entin Hidayah, Nur Iriawan, Nadjadji Anwar, and Edijatno

Disaggregation of rainfall data to an hourly scale is important for supplying input to continuous rainfall-runoff models when automatic rainfall records are limited. Continuous rainfall-runoff modelling requires rainfall data as an hourly series. Such data can be obtained by temporal disaggregation at a single site. This paper attempts to build a single-site rainfall model based on a time series (AR1) model with an adjusting procedure and a dummy procedure. The model is estimated with Bayesian Markov Chain Monte Carlo (MCMC), and the target variable is hourly rainfall depth. Model performance was evaluated by comparing historical data with model predictions. The results show that the model performs well for dry interval periods. The good performance is reflected in a small mean absolute error (MAE) of 0.21.


I. INTRODUCTION
The simulation of continuous rainfall is an important area of hydrological research, particularly within the context of flood estimation. Flood estimation requires long-term rainfall records at high temporal resolution (hourly rainfall). However, such data are difficult to provide in Indonesia because field observations are limited. Generating hourly rainfall with a time series model is one alternative for obtaining the required rainfall data. The generation of rainfall at finer temporal resolution has been studied in many countries. At first, rainfall and flood data were disaggregated from yearly to monthly values [1] using a time series approach, the Periodic Autoregressive (PAR) model [2]. However, that model could not preserve the covariance structure of the lower-level variables that form the series within each period.
Further disaggregation to smaller time scales (from daily to sub-daily) has been developed in different ways [3][4][5][6]. The latest of these studies used a simple mathematical model but did not provide accurate results. Wong disaggregated with the Bartlett-Lewis model, trying variations of the model parameters that were optimized with an evolutionary algorithm [7]. Burian took a different approach, using artificial neural networks (ANN) to disaggregate hourly rainfall data into finer time intervals [8][9]. Koutsoyiannis et al. developed a rainfall disaggregation method that applies adjusting procedures to a Poisson cluster model, with the cumulative hourly rainfall depth following a Beta distribution and the occurrence of rain following a geometric distribution conditional on the total daily rainfall [10][11]. This model was tested by Fytilas on hourly data from the south-west of the UK, the U.S., and the Tiber River, Italy. The results indicate that the methodology performs well. To ease its use, the model is implemented in the Heytos program package.
Inspired by the success of Koutsoyiannis in modelling temporal and spatial rainfall, Hidayah et al. applied the Heytos temporal rainfall disaggregation model at the Sentral rainfall station in the Sampean catchment area, East Java [12]. The implementation did not perform consistently well. Judged by the marginal moments, the spatial and temporal correlation, and the proportion and length of dry intervals, the months of February to November gave a good reproduction of the actual hydrograph, whereas December and January gave rather poor results, with errors of 0.51 and 0.79. The Heytos model is suited to disaggregating rainfall in subtropical regions, where rainfall has a gamma-type distribution.
Onof et al. have constructed a model of this kind [13]. It is based on the random-parameter Bartlett-Lewis rectangular pulse rainfall model combined with multiscale disaggregation, and was applied to an urban area in Denmark. Model performance under extreme rainfall conditions improved by 2-15%. Koutsoyiannis et al. developed rainfall disaggregation using adjusting procedures on a Poisson cluster model and examined it on hourly data from the UK and the US [10]. The results indicated that the methodology performs well.
The Bayesian approach has succeeded in reducing the uncertainty of spatial rainfall prediction. Todini et al. developed spatial rainfall prediction using a Bayesian method that combines rainfall radar, satellite, and local measurements [14]. Sahu et al. used a Bayesian approach to estimate a spatial rainfall model that combines rainfall radar and local measurements [15]. Lima successfully employed a Bayesian method to predict daily rainfall occurrence [16]. These previous studies show that Bayesian models have so far been used mainly to estimate the spatial aspect of rainfall. The temporal aspect, however, has not been addressed at all.
Given the results of earlier temporal model implementations and the success of the Bayesian approach, this paper attempts to develop a Bayesian temporal model for rainfall data.
The theory supporting this research consists of the following:

A. Rainfall Disaggregation
Rainfall disaggregation is a method for deriving synthetic rainfall at a finer time scale (for example hourly) from rainfall data at a coarser scale (daily or weekly). The synthetic data are generated from a stochastic model. Disaggregation plays an important role in hydrological modelling because it can produce rainfall data at higher resolution. Carpenter et al. showed that high-resolution rainfall data result in smaller model error [17]. A disaggregation model is developed to reproduce statistics at more than one level of aggregation. Disaggregation can be applied both temporally and spatially.
Disaggregated data are generated by random simulation. The simulation is set up by choosing parameters, which then control the disaggregation process. Because the results of random simulation fluctuate with the parameters used, several sets of historical data and simulations of the statistical model are needed to generate a disaggregated time series. The results can be evaluated by calculating a statistic, such as the skewness, from the simulated disaggregated series and comparing it with the same statistic from the historical data.

B. Stationary Time Series Models
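A time series model is a stochastic model. A stochastic process $\{Y(t), t \in A\}$ is a set of random variables, where $A$ is the index set [18]. The index is interpreted as time, and $Y(t)$ is the state of the process at time $t$. The autoregressive (AR) model is the simplest stationary time series model. It is formed through a regression that links the current value with previous values at various time lags [19]. The general form of the AR model of order $p$ is [2]:

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + a_t$$

where the hydrological process is represented by $Y_t$; $a_t$ is a white noise process at time $t$ with mean zero and constant variance $\sigma_a^2$; and $\phi_1, \phi_2, \dots, \phi_p$ are the autoregressive coefficients of order $p$. The AR1 model takes the simple form:

$$Y_t = \phi_1 Y_{t-1} + a_t$$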
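As an illustration only, a zero-mean AR1 recursion of the form Y_t = phi * Y_{t-1} + a_t can be simulated in a few lines of Python. The coefficient 0.6, the noise standard deviation, the zero starting value, and the function names are assumptions chosen for this sketch, not values from this study.

```python
import random


def simulate_ar1(phi, sigma, n, seed=0):
    """Simulate n values of Y_t = phi * Y_{t-1} + a_t with Gaussian white noise a_t."""
    rng = random.Random(seed)
    y = [0.0]  # hypothetical initial condition
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y


def lag1_autocorrelation(y):
    """Sample lag-1 autocorrelation of a series."""
    mean = sum(y) / len(y)
    dev = [v - mean for v in y]
    numerator = sum(dev[t] * dev[t + 1] for t in range(len(dev) - 1))
    return numerator / sum(d * d for d in dev)


series = simulate_ar1(phi=0.6, sigma=1.0, n=5000)
r1 = lag1_autocorrelation(series)  # should lie close to phi for a long series
```

For a stationary AR1 process the lag-1 autocorrelation equals the coefficient phi, so comparing `r1` with the chosen `phi` is a quick sanity check on the simulation.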

C. Seasonal Autoregressive Model (AR1)
This model assumes that rainfall occurs periodically and is represented by $Y_{\nu,\tau}$, where $\nu$ denotes the year and $\tau$ denotes the season, with $\tau = 1, \dots, \omega$ and $\omega$ the number of seasons in a year. The general form of the Periodic Autoregressive model of order 1, PAR(1), is given in [2].
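One common way of writing a PAR(1) model with these indices is sketched below. It is given as an assumption about the intended form, since notation differs between texts; the seasonal mean $\mu_\tau$, the seasonal lag-1 coefficient $\phi_{1,\tau}$, and the white noise term $a_{\nu,\tau}$ are symbols introduced here for illustration.

```latex
Y_{\nu,\tau} = \mu_\tau + \phi_{1,\tau}\,\bigl(Y_{\nu,\tau-1} - \mu_{\tau-1}\bigr) + a_{\nu,\tau},
\qquad \tau = 1, \dots, \omega
```

For $\tau = 1$, the previous season is season $\omega$ of year $\nu - 1$.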

D. Autocorrelation and Partial Autocorrelation
The autocorrelation and partial autocorrelation coefficients are major tools for analyzing time series data. The autocorrelation coefficient measures the strength of the linear relationship between the observation at time $t$, denoted $Z_t$, and the observations at previous times, denoted $Z_{t-1}, Z_{t-2}, \dots, Z_{t-k}$. The autocorrelation function of a time series is given in [16]. Partial autocorrelation measures the closeness of the relationship between $Z_t$ and $Z_{t-k}$ when the effect of the intermediate lags $1, 2, \dots, k-1$ is removed. The partial autocorrelation function shows the partial correlation between the observation at time $t$ and the observations at previous times.
The partial autocorrelation at lag $k$ is denoted $\phi_{kk}$. Its value can be determined through the Yule-Walker equations.
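A minimal sketch of how these two quantities can be computed is given below. It assumes the usual sample autocorrelation, the sum of products of deviations from the mean at lag k divided by the sum of squared deviations, and obtains the partial autocorrelations from the Yule-Walker equations through the Durbin-Levinson recursion. The function names and the five-value example series are hypothetical.

```python
def sample_acf(z, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag of the series z."""
    n = len(z)
    mean = sum(z) / n
    dev = [v - mean for v in z]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]


def pacf_from_acf(rho):
    """Partial autocorrelations phi_11, phi_22, ... via the Durbin-Levinson recursion."""
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(rho)):
        if k == 1:
            phi_kk = rho[1]
            cur = [phi_kk]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # update the lower-order coefficients, then append phi_kk
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


z = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy series, not rainfall data
rho = sample_acf(z, 2)
pacf = pacf_from_acf(rho)
```

At lag 1 the partial autocorrelation equals the autocorrelation, which gives a simple check on any implementation.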

E. Bayesian Approach
In estimation theory, there are two popular approaches: the classical (frequentist) approach and the Bayesian approach. In classical statistics, inference is based entirely on sample data from the population. In contrast, Bayesian statistics uses not only the sample data from the population but also a prior distribution for each parameter. The classical approach treats a parameter as fixed (a constant or single value). The Bayesian approach, on the other hand, treats the parameter as having a distribution, called the prior. By combining the sample data, used to calculate the likelihood, with the prior distribution of the parameter, the posterior distribution of each parameter can be determined. The parameter estimate is then derived from this posterior.
Bayesian statistics offers a simple way to solve multidimensional parameter estimation problems. Bayesian theory provides a way to estimate the parameters together with their distributions directly. This is not easy to achieve with traditional statistics. For these reasons, this research uses the Bayesian approach to estimate the model.
The Bayesian model is developed from Bayes' theorem. When Bayes' theorem is used as the tool for estimating a distribution or a model, the approach is called the Bayesian method. In this method, the parameters of the distribution or model are treated as random variables. The Bayesian model can be represented as:

$$p(\theta \mid x) = \frac{l(x \mid \theta)\, p(\theta)}{p(x)}$$

where
$x$ : the set of data used to construct the Bayesian model
$\theta$ : the unknown parameter, whose posterior distribution is to be characterized
$p(\theta)$ : the prior distribution
$l(x \mid \theta)$ : the likelihood function, describing the probabilistic pattern of the data
$p(x)$ : a normalizing constant
$p(\theta \mid x)$ : the posterior distribution of $\theta$ given the data $x$
Inference on the posterior distribution of each parameter in $\theta$ is obtained by integrating the joint posterior over the space of the other parameters.
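As a small worked illustration of combining a prior with a likelihood (it is not part of the rainfall model in this paper), consider estimating the probability theta that an hour is wet from k wet hours out of n observed hours, with a uniform prior. The counts, the grid size, and the function name below are assumptions made for the sketch.

```python
def grid_posterior(k, n, grid_size=1001):
    """Posterior of a wet-hour probability on a grid: uniform prior, binomial likelihood."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size                      # uniform prior p(theta)
    likelihood = [t ** k * (1.0 - t) ** (n - k) for t in thetas]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    norm = sum(unnormalized)                       # plays the role of p(x) on the grid
    posterior = [u / norm for u in unnormalized]
    return thetas, posterior


thetas, posterior = grid_posterior(k=3, n=10)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))
```

With a uniform prior the exact posterior mean is (k + 1) / (n + 2), so the grid result can be checked against 4/12 for this toy case.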

F. MCMC
The Markov Chain Monte Carlo (MCMC) method makes fairly complex models tractable, so it is considered valuable in Bayesian analysis [20]. Several techniques are available for numerical integration, and most existing methods are closely tied to the form of the integral to be evaluated. Monte Carlo integration is a technique for obtaining an expected value (expectation). In a simple form, it can be written:

$$E[f(x)] = \int_a^b f(x)\, p(x)\, dx \approx \frac{1}{n} \sum_{i=1}^{n} f(x_i)$$

where the values $x_1, x_2, \dots, x_n$ are drawn independently from the density $p(x)$ on the interval $(a, b)$; in the simplest case, $p(x)$ can be the uniform distribution on $(a, b)$.
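A minimal sketch of Monte Carlo integration with a uniform density on (a, b) follows. The integrand x squared on (0, 1), the sample size, the seed, and the function name are assumptions chosen for illustration; the exact expectation in this toy case is 1/3.

```python
import random


def monte_carlo_expectation(f, a, b, n, seed=0):
    """Estimate E[f(X)] for X uniform on (a, b) by averaging f over n random draws."""
    rng = random.Random(seed)
    return sum(f(rng.uniform(a, b)) for _ in range(n)) / n


estimate = monte_carlo_expectation(lambda x: x * x, 0.0, 1.0, 100_000)
```

The accuracy improves roughly with the square root of the number of draws, which is why large sample sizes are typical.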
In Bayesian analysis, MCMC simplifies the analysis, so that decisions based on it can be reached quickly and accurately. The MCMC method brings two conveniences to Bayesian analysis [21]. First, it reduces a complex, high-dimensional integral to integrals in one dimension. Second, it allows the density to be estimated by generating a Markov chain of n sequential values.

G. Gibbs Sampling
One of the MCMC approaches is Gibbs sampling [22]. Gibbs sampling is a technique for generating random variables from a marginal distribution indirectly, without having to calculate its density. By using Gibbs sampling, difficult calculations can be avoided [22].
The purpose of Gibbs sampling in data analysis is to draw samples of each parameter $\theta_k$ individually from its full conditional distribution, given all the other parameters and the data. To obtain these samples, all the model parameters are collected into a parameter vector with a special partition, that is: $\theta = (\theta_k, \theta_{-k})$.
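The alternating draws from full conditionals can be sketched for a simple case: normal data with unknown mean mu and precision tau (the inverse of the variance). This is a generic textbook example, not the model fitted in this paper. The priors (a normal prior on mu with mean m0 and precision p0, a gamma prior on tau with shape a0 and rate b0), the synthetic data, and all names are assumptions made for illustration.

```python
import random


def gibbs_normal(y, n_iter=5000, burn_in=1000, seed=1,
                 m0=0.0, p0=1e-6, a0=0.01, b0=0.01):
    """Gibbs sampler for the mean mu and precision tau of normal data y."""
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    mu, tau = sum_y / n, 1.0  # starting values
    draws = []
    for it in range(n_iter):
        # Full conditional of mu given tau and y: normal
        prec = p0 + n * tau
        mean = (p0 * m0 + tau * sum_y) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # Full conditional of tau given mu and y: gamma(shape, rate)
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * sum((v - mu) ** 2 for v in y)
        tau = rng.gammavariate(shape, 1.0 / rate)  # second argument is the scale
        if it >= burn_in:
            draws.append((mu, tau))
    return draws


data_rng = random.Random(42)
y = [data_rng.gauss(3.0, 1.0) for _ in range(200)]  # synthetic data
draws = gibbs_normal(y)
mu_mean = sum(d[0] for d in draws) / len(draws)
tau_mean = sum(d[1] for d in draws) / len(draws)
```

Each iteration updates one block of the partition while holding the other fixed, so no joint density ever has to be evaluated.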

H. Model Evaluation
Model evaluation is intended to determine the best model. The best model can be identified from the error between the simulated values and the observed data. One such measure is the Mean Absolute Error (MAE), formulated as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

where $y_i$ is the observed value, $\hat{y}_i$ is the simulated value, and $n$ is the number of data points.
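The MAE is the average of the absolute differences between observed and simulated values, and it can be computed directly as sketched below. The function name and the four-value example series are hypothetical and are not data from this study.

```python
def mean_absolute_error(observed, simulated):
    """Average absolute difference between paired observed and simulated values."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


observed = [0.0, 1.0, 2.0, 0.0]
simulated = [0.0, 2.0, 4.0, 1.0]
error = mean_absolute_error(observed, simulated)  # (0 + 1 + 2 + 1) / 4
```

A smaller MAE means the simulated series stays closer to the observations on average, in the same units as the data (here, rainfall depth).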

III. RESULTS AND DISCUSSION
This time series rainfall disaggregation model was implemented at Sentral Station, East Java. The AR1 time series model is used to build the rainfall disaggregation because it tends to fit any form of rainfall distribution.
Conditions in Indonesia are tropical, with high and uneven rainfall, so it is difficult to find an appropriate distribution. A descriptive analysis of the hourly rainfall time series at Sentral Station, Bondowoso, is presented in Table 1. The table shows the average rainfall, the variance, and the temporal correlation between consecutive hours, which range below 0.05, 0.6, and 0.26, respectively. In contrast, the proportion of dry intervals is high, in the range above 0.99. The highest average rainfall occurs in February and the lowest in July. It can be concluded that the wet months run from October through April, with a high average rainfall ranging from 0.08 mm to 0.43 mm. Conversely, the dry months run from May to September, with an average rainfall depth below 0.05 mm. The distinction between wet and dry months is clearer in the yearly pattern of hourly rainfall depth shown in Figure 1. In the figure, the wet months are marked by clusters of rain occurrences.
In the earlier rainfall disaggregation study with the Heytos program at Sentral Station [12], December yielded an error of 0.51 [21]. December is therefore used in this research as the reference month for generating the rainfall disaggregation model.
The rainfall disaggregation in this research used a time series approach. Determining the suitability of the time series model began with an examination of the stationarity of the data. The PACF plots were produced with Minitab 14. Figure 2 shows that the coefficients decay after lag 1, with both positive and negative values. Values at 24-hour intervals are positively correlated and exceed the confidence interval for alpha = 0.05 (the T value falls in the rejection region).
Stationarity was obtained with a repetition at every lag 24. It can be concluded that the data contain a daily seasonal pattern.
From the results of the PACF examination, the appropriate method for disaggregating the rainfall data is the seasonal AR1 model. The modelling used WinBUGS 1.4. WinBUGS is programming-language-based software used to generate random samples from the posterior distribution of the parameters of a Bayesian model. The use of WinBUGS 1.4 in the adjusting process and in the modelling is structured in the Doodle shown in Figure 3.
The figure shows the program structure of the seasonal AR1 model as composed in the WinBUGS Doodle. The variable y[i] follows a normal distribution, written with the reserved word dnorm(mu[i], tau), where the logical node mu[i] is defined by the statement translated from the logical links in the Doodle of Figure 3. mu[i] is the expectation of the seasonal AR1 model, and tau, connected with y[i], is the precision (the inverse of the variance) of the seasonal AR1 model. The prior distributions for the model parameters a, b, and c were chosen by validating the models to obtain the smallest error. A dummy is fitted in the model to produce the value 0 when there is no rain. The adjusting procedures are performed outside the seasonal AR1 model and are connected directly with x[i] and mu[i] to obtain the difference between ZB[i] and Za[i]. The error was also calculated directly as a logical node in the model, based on the MAE.
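To make the role of the dummy concrete, the sketch below assumes one plausible form of the expectation: an intercept a, a lag-1 term with coefficient b, and a lag-24 (daily seasonal) term with coefficient c, all multiplied by a 0/1 dummy that forces the mean to zero in dry hours. This functional form, the function name, and the parameter values are hypothetical; the exact expression for mu[i] used in the WinBUGS model is not reproduced here.

```python
def seasonal_ar1_mean(y, i, a, b, c, dummy, season=24):
    """Hypothetical mean: dummy[i] * (a + b * y[i-1] + c * y[i-season])."""
    if i < season:
        raise ValueError("need at least one full season of history")
    return dummy[i] * (a + b * y[i - 1] + c * y[i - season])


y = [1.0] * 25          # toy hourly series, not observed rainfall
wet = [1] * 25          # dummy = 1: rain is allowed in every hour
dry = [1] * 24 + [0]    # dummy = 0 in the last hour: no rain

mu_wet = seasonal_ar1_mean(y, 24, a=3.0, b=0.4, c=0.3, dummy=wet)
mu_dry = seasonal_ar1_mean(y, 24, a=3.0, b=0.4, c=0.3, dummy=dry)
```

Multiplying by the dummy is one simple way to obtain exact zeros; whether the dummy enters the model multiplicatively is an assumption of this sketch.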
The models were run in WinBUGS 1.4 with a normal component, using 2952 data points and a total of 40000 Gibbs sampler iterations on a personal computer (Intel Centrino Core 2 Duo Processor P8600, 2.4 GHz, RAM 6 GB). The simulation took 32 seconds to finish. The disaggregated rainfall results and the model parameters are shown in Table 2.
The estimated posterior density of the model parameters (a, b, and c) was approximated by the kernel density in Figure 4. The result was satisfactory because the densities were smooth. The distribution of each rainfall parameter tends to be symmetric, centred between 2.915 and 3.344 for parameter a, between 0.3149 and 0.4728 for parameter b, and between 0.2112 and 0.3226 for parameter c. Model performance can be assessed from the Bayesian MCMC simulation results: the estimated posterior density of each parameter, the running quantiles, and the autocorrelation. The Bayesian MCMC simulation history for parameters a, b, and c in Figure 5 (iteration history of the model) shows that the generated chains mix quickly. This means that the estimation process responds well to the parameter values. The running quantiles in Figure 6 show straight lines for the upper and lower quantiles. This means that the model has reached stationarity and convergence. The autocorrelation in Figure 7 shows that the simulation results fulfil the Markov chain property, in which each generated value is influenced only by the one preceding it.
Comparing the observed data with the simulation from the Bayesian seasonal AR1 model gives an MAE of 0.21. With the same data, the Heytos model produced an error of 0.51. This means that the Bayesian seasonal AR1 model improves model performance. The correspondence between the simulated and observed series can be seen in Figure 8.

IV. CONCLUSION
Using the Bayesian seasonal AR1 model with an adjusting procedure and a dummy for a tropical region gives good performance in generating disaggregated rainfall data with varying rainfall distributions. The good performance is characterized by the very short time required for the model iterations (32 seconds) and the resulting MAE of 0.21.
The AR1 model with a dummy procedure is capable of providing a zero value when there is no rain. However, the process has to be done manually, so the model cannot yet be used to predict disaggregated rainfall in future applications. For the process to be automated, the model structure needs to be extended with a binary process.
$\Delta Z = \lVert Z - \tilde{Z} \rVert$ with the estimated parameters, to obtain a smaller distance $\Delta Z$ until the approved limit is reached, and choosing the final arrangement of s

Figure 1. Rainfall depth series in 2004-2008
Figure 2. Result of PACF plot for daily rainfall data

TABLE 1. VARIATION OF AVERAGE, VARIANCE, TEMPORAL CORRELATION STRUCTURE (AUTOCORRELATION) AND THE PROPORTION OF THE DRY INTERVALS OF HOURLY RAINFALL DATA AT THE SENTRAL STATION