Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java Using Geographically Weighted Bivariate Generalized Poisson Regression

Poisson regression is a non-linear regression model whose response variable is count data following a Poisson distribution. A pair of highly correlated count variables can be modeled with Bivariate Poisson Regression. The numbers of infant deaths and maternal deaths are count data that can be analyzed in this way. Poisson regression assumes equidispersion, that is, the mean and the variance are equal. In practice, however, count data often have a variance that is greater than the mean (overdispersion) or smaller than the mean (underdispersion). Violations of this assumption can be handled by applying Generalized Poisson Regression. In addition, the characteristics of each regency can affect the number of cases that occur, which can be addressed with a spatial method called Geographically Weighted Regression. This study analyzes the numbers of infant and maternal deaths in East Java in 2016 using the Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling with Adaptive Bisquare Kernel weighting produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. The variables that significantly influence the numbers of infant and maternal deaths are the percentages of pregnant women who visit health workers at least 4 times during pregnancy, pregnant women who receive Fe3 tablets, obstetric complications handled, households with clean and healthy behavior, and married women whose age at first marriage was under 18 years.


I. INTRODUCTION
Two important indicators of the level of public health in a region are the infant mortality rate (IMR) and the maternal mortality rate (MMR). The two are related. While a baby is in the womb, it obtains nutrients from the mother's body through the placenta. Therefore, the mother's condition affects both the fetus and the newborn. In addition, from birth until the age of one year, the mother's role strongly influences the baby's growth. Poisson regression is a non-linear regression that assumes equidispersion, in which the mean and the variance are equal. Infant and maternal mortality are correlated count data whose variance differs from the mean. Previous research has found that infant and maternal mortality

II. LITERATURE

A. Poisson Regression
Poisson regression is built on the Poisson distribution, a distribution for events that have a small probability of occurring, where the event depends on a certain time interval or a particular region, with observations recorded as discrete variables and with independent predictor variables. The Poisson distribution has the following probability model [5].
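The equidispersion property of the Poisson distribution can be checked numerically. The sketch below assumes the standard Poisson probability function P(Y = y) = exp(-mu) mu^y / y!; the function names and the truncation point y_max are illustrative choices, not part of the original study.

```python
import math


def poisson_pmf(y, mu):
    """Standard Poisson probability P(Y = y) for mean mu > 0.

    Computed on the log scale to avoid overflow for large y.
    """
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def mean_and_variance(mu, y_max=100):
    """Approximate the mean and variance by summing over y = 0..y_max.

    y_max is an illustrative truncation point; it must be large
    relative to mu so that the neglected tail is negligible.
    """
    probs = [poisson_pmf(y, mu) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


mean, var = mean_and_variance(4.0)
print(mean, var)  # both are numerically close to mu = 4.0
```

When observed counts show a sample variance clearly above or below the sample mean, this equality fails, which is the situation that motivates the Generalized Poisson model.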
Here a is the number of parameters under the population and b is the number of parameters under H0. If the simultaneous test rejects H0, the next step is partial parameter testing to determine which parameters have a significant influence on the model.

C. Spatial Effects
Spatial data modeling can be grouped by spatial data type, i.e., spatial point data and spatial area data. In spatial data modeling, relationships may vary from region to region; this variation is called spatial heterogeneity and indicates differences in the relationships between regions. The hypothesis used to test for spatial heterogeneity is described as H0:
Distant locations also show spatial diversity. Spatial diversity between locations is represented by a weight matrix W whose entries are a function of the Euclidean distance between locations. One alternative weighting function is the Adaptive Bisquare Kernel function. An adaptive kernel function is a kernel function that has a different bandwidth at each location [7].
Here $d_{ii^*}$ is the Euclidean distance between location i and location i*, whereas $h_i$ is the smoothing parameter, or bandwidth, at location i. The adaptive function has different bandwidths according to the data density in the analysis area. Where the data are dense, the bandwidth is small, whereas where the data are sparse, the bandwidth is larger. This function is therefore able to adjust to the variation in the data [8]. The optimum bandwidth can be selected with the Generalized Cross Validation (GCV) method defined by equation (13), in which the location being predicted is not included in the estimation.
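The weighting scheme described above can be sketched as follows. The sketch assumes the usual bisquare form, w = (1 - (d/h)^2)^2 for d <= h and 0 otherwise, and it assumes that the adaptive bandwidth at a location is the distance to its k-th nearest neighbour; both the neighbour rule and the function names are illustrative assumptions, since the original equation is not reproduced in this text.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two (x, y) coordinate pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def adaptive_bisquare_weights(coords, i, k):
    """Bisquare weights of every location relative to location i.

    Assumption: the bandwidth h_i is the distance from location i to its
    k-th nearest neighbour (k >= 1), so dense areas get a small bandwidth
    and sparse areas a larger one. Coordinates must be distinct so that
    h_i is positive.
    """
    d = [euclidean(coords[i], c) for c in coords]
    h_i = sorted(d)[k]  # sorted(d)[0] is location i itself (distance 0)
    return [(1 - (dij / h_i) ** 2) ** 2 if dij <= h_i else 0.0 for dij in d]


# Four hypothetical locations on a line; the bandwidth for location 0 with k = 2 is 2.0
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
print(adaptive_bisquare_weights(coords, i=0, k=2))  # [1.0, 0.5625, 0.0, 0.0]
```

Locations at or beyond the bandwidth receive zero weight, so each local model is fitted only from nearby observations.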

D. Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR)
The GWBGPR model is an extension of Bivariate Generalized Poisson Regression (BGPR) that uses geographical weighting in its parameter estimation. The GWBGPR parameters are estimated with the Maximum Likelihood Estimation (MLE) method, which maximizes the likelihood function. The likelihood function used for GWBGPR parameter estimation is shown in Formula 15. The first derivatives do not have a closed form, so the GWBGPR estimator is obtained with the Newton-Raphson iteration shown in Formula 17.
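To illustrate how Newton-Raphson iteration maximizes a geographically weighted likelihood, the sketch below uses a deliberately simplified stand-in: a univariate Poisson model with only an intercept, fitted at one location with fixed weights. It is not the GWBGPR likelihood of Formula 15; the function name, starting value, and tolerance are assumptions made for illustration.

```python
import math


def weighted_poisson_intercept(y, w, tol=1e-10, max_iter=100):
    """Newton-Raphson estimate of beta0 in log(mu) = beta0.

    Maximizes the weighted Poisson log-likelihood
        sum_i w_i * (y_i * beta0 - exp(beta0) - log(y_i!)).
    Simplified illustration only: one response, no predictors.
    """
    beta0 = math.log(sum(y) / len(y))  # start from the unweighted mean
    for _ in range(max_iter):
        mu = math.exp(beta0)
        score = sum(wi * (yi - mu) for yi, wi in zip(y, w))  # first derivative
        hessian = -mu * sum(w)                               # second derivative
        step = score / hessian
        beta0 -= step                                        # Newton-Raphson update
        if abs(step) < tol:
            break
    return beta0


# Hypothetical counts and weights for one focal location
y = [2, 4, 6]
w = [1.0, 0.5, 0.0]
beta0 = weighted_poisson_intercept(y, w)
print(math.exp(beta0))  # close to the weighted mean (2*1.0 + 4*0.5) / 1.5
```

In this toy case the answer is known in closed form (the log of the weighted mean), which makes it easy to confirm that the iteration converges; the full bivariate model needs the same update applied to a parameter vector with a gradient vector and Hessian matrix.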
Simultaneous testing of the Geographically Weighted Bivariate Generalized Poisson Regression model is used to assess the joint significance of the model parameters, with the following hypothesis H0:
The test statistic approximately follows a chi-square distribution with (a-b) degrees of freedom, where a is the number of parameters under the population and b is the number of parameters under H0.
If H0 is rejected at significance level alpha, then there are predictor variables that affect the response variables [4].

E. Correlation
The correlation coefficient is an indicator of the strength of the linear relationship between two variables [9]. The correlation coefficient is defined as in (19). The correlation coefficient can show two kinds of relationship, namely positive and negative. The correlation test for the response variables uses the following hypotheses:
H0: there is no relationship between Y1 and Y2
H1: there is a relationship between Y1 and Y2
Test statistics:
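As an illustration of the correlation test, the sketch below computes the Pearson correlation coefficient and the t statistic commonly paired with it, t = r sqrt(n - 2) / sqrt(1 - r^2). That this is the statistic used in the study is an assumption, because the original formula is not reproduced here; the data are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom.

    Assumed form (standard textbook statistic), valid for |r| < 1.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical paired counts, not the East Java data
y1 = [1, 2, 3, 4, 5]
y2 = [2, 1, 4, 3, 5]
r = pearson_r(y1, y2)
t = correlation_t_statistic(r, len(y1))
print(r, t)
```

H0 would be rejected when the absolute value of t exceeds the critical value of the t distribution with n - 2 degrees of freedom at the chosen significance level.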

F. Multicollinearity
Multicollinearity is a condition in which predictor variables are highly correlated with one another. The presence of multicollinearity may result in inaccurate parameter estimates. It can be detected with the Variance Inflation Factor (VIF), which shows how much the variance of a parameter estimate increases due to the presence of multicollinearity.
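The VIF can be sketched with its standard definition, VIF_j = 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination obtained by regressing predictor j on the remaining predictors. The function name and the example values are illustrative assumptions.

```python
def vif(r_squared):
    """Variance Inflation Factor for a predictor.

    r_squared is the R^2 from regressing that predictor on all the
    other predictors (standard definition, assumed here).
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# A predictor unrelated to the others has VIF 1; R^2 = 0.9 gives a VIF of about 10
for r2 in (0.0, 0.5, 0.9):
    print(r2, vif(r2))
```

Under this definition, a VIF threshold of 10 corresponds to a predictor for which about 90% of the variance is explained by the other predictors.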

G. Best Model Selection
When many variables are used in multivariate regression analysis, it can be difficult to determine the effect of the predictor variables on the response variables. Therefore, variable selection is carried out to obtain the best regression model. The selection procedure uses the Mean Square Error (MSE) criterion, and the best model is the one with the smallest MSE value. MSE is the average of the squared estimation errors, as shown in Formula 21.
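A minimal sketch of the MSE criterion follows. It uses the plain average of squared differences between observed and fitted values for a single response; the exact multivariate form of Formula 21 is not reproduced here, and the model names and numbers are invented for illustration.

```python
def mse(observed, fitted):
    """Mean of the squared differences between observed and fitted values."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)


# Hypothetical observed counts and fitted values from two candidate models
observed = [3, 5, 2]
fitted_by_model = {
    "model_a": [2, 5, 4],
    "model_b": [3, 4, 2],
}
scores = {name: mse(observed, fit) for name, fit in fitted_by_model.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

For a bivariate response, the same calculation can be applied to each response separately, and the model with the smallest values is preferred.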

H. Infant Mortality Rate and Maternal Mortality Rate
Infant deaths are deaths that occur between birth and just before the baby reaches exactly one year of age. The causes of infant mortality are divided into two types, namely endogenous and exogenous [10]. Maternal mortality is the death of a woman that occurs during pregnancy or up to 42 days after the end of pregnancy, regardless of the duration and site of the pregnancy, but not due to an accident. Maternal mortality is categorized into direct death and indirect death [11].

III. METHODOLOGY
The data used in this study are secondary data obtained from the Central Statistics Agency of East Java and from the Health Profile of East Java 2016 issued by the East Java Provincial Health Office. The 38 observation units consist of 29 districts and 9 cities. The study uses two response variables and five predictor variables.

A. Descriptive Statistics
East Java is one of the most populous provinces in Indonesia. It is located between 111.0° and 114.4° east longitude and between 7.12° and 8.48° south latitude. East Java is divided into 38 regencies, comprising 29 districts and 9 cities. In the early stage of this research, descriptive statistics were computed for the variables presumed to affect the numbers of infant and maternal deaths in East Java, as shown in Table 2.

C. Correlation Test between Variables and Multicollinearity
The correlation coefficient between the response variables indicates whether the number of infant deaths is correlated with the number of maternal deaths. The hypotheses for the correlation test between the response variables are stated as follows:
H0: There is no correlation between Y1 and Y2
H1: There is a correlation between Y1 and Y2
The test statistic used in this test is expressed as follows:
Based on the test result, there is a significant relationship between the infant and maternal mortality cases. The criterion used to detect multicollinearity is the VIF value. If the VIF value is greater than 10, multicollinearity is present. The VIF values of the predictor variables in this research are shown in Table 3. Table 3 shows that all predictor variables have VIF values lower than 10, so the predictor variables are not strongly correlated with one another. In conclusion, there is no multicollinearity among the predictor variables used. Therefore, all predictor variables can be used in the subsequent BGPR and GWBGPR modeling.

D. Infant and Maternal Mortality Modeling with Bivariate Generalized Poisson Regression
Simultaneous testing is needed to determine whether at least one of the variables used affects the model, with the hypothesis stated below.
At a significance level of 5%, the variables that significantly influence the number of infant deaths are the percentages of pregnant women who visit health workers at least 4 times during pregnancy, pregnant women who received Fe3 tablets, obstetric complications handled, confinement service by health personnel, and clean and healthy households. There are also five variables that significantly affect the number of maternal deaths: the percentages of pregnant women who visit health workers at least 4 times during pregnancy, pregnant women who received Fe3 tablets, obstetric complications handled, clean and healthy households, and married women whose age at first marriage was under 18 years. The AIC obtained from BGPR modeling is 5,682.156.

E. Spatial heterogeneity test
The diversity of regional characteristics in the numbers of infant and maternal deaths, as well as in the variables that affect them, can be identified through spatial heterogeneity testing with the following hypothesis: H0:

F. Infant and Maternal Mortality Modeling with GWBGPR
Simultaneous testing is needed to determine whether at least one of the variables used affects the model, with the hypothesis stated below.
For each 1% increase in pregnant women who visit health workers at least 4 times during pregnancy, the average number of infant deaths is multiplied by 0.989, assuming that the other variables are constant. If the percentage of clean and healthy households in Jombang increases by 1%, the number of infant deaths is multiplied by 0.999, and a 1% increase in married women whose age at first marriage was under 18 years multiplies the number of infant deaths by 1.003. Each 1% increase in obstetric complications handled and in deliveries assisted by health personnel multiplies the number of infant deaths by 1.008 and 1.050, respectively. The variables obstetric complications handled and delivery assisted by health personnel therefore have signs that do not conform to the expected direction of the relationship in the infant mortality model. This is attributed to dependency between the predictor variables or to the data collection process. The data are totals for one year, whereas the size of each variable is not always constant during the year and does not necessarily represent the condition in every month. The changed signs arise because the correlation of obstetric complications handled and delivery assisted by health personnel with the number of infant deaths is positive in the data, which is not in accordance with the negative correlation that would be expected.
Each 1% increase in pregnant women who receive Fe3 tablets multiplies the number of maternal deaths by 0.988, assuming that the other variables are constant. If the percentage of married women whose age at first marriage was under 18 years increases by 1%, the number of maternal deaths is multiplied by 1.025, assuming that the other variables are constant. A 1% increase in obstetric complications handled and in clean and healthy households multiplies the number of maternal deaths by 1.005 and 1.019, respectively.
The variables obstetric complications handled and clean and healthy households have a mismatched direction of relationship in the maternal mortality model, which is attributed to dependencies between the predictor variables or to the data collection process. The correlation of these two variables with the number of maternal deaths is positive in the data, whereas a negative correlation would be expected, and this causes the change of sign.
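The multipliers reported above can be read as percentage changes in the expected count. The sketch below converts a multiplier into a percentage change and shows how the effect compounds over several percentage points; it assumes the usual log-link interpretation, in which each multiplier equals exp(coefficient), and the function names are illustrative.

```python
import math


def percent_change(rate_ratio):
    """Percentage change in the expected count for a 1-unit increase."""
    return (rate_ratio - 1.0) * 100.0


def coefficient_from_ratio(rate_ratio):
    """Regression coefficient implied by a multiplier, assuming a log link."""
    return math.log(rate_ratio)


def ratio_for_increase(rate_ratio, units):
    """Multiplier for an increase of several units (effects compound)."""
    return rate_ratio ** units


# Multipliers taken from the text above
print(percent_change(0.989))          # about -1.1: a 1.1% decrease
print(percent_change(1.050))          # about +5.0: a 5% increase
print(ratio_for_increase(0.989, 10))  # multiplier for a 10-point increase
```

A multiplier below 1 therefore corresponds to a decrease in the expected number of deaths, and a multiplier above 1 to an increase.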

G. Selection of the Best Model
The MSE criterion is used to evaluate the performance of predictors or estimators. MSE also conveys the concepts of bias, precision, and accuracy in statistical estimation. The MSE calculation for regression models with multivariate responses is as follows.
Table 4 shows the MSE values of each modeling method. Based on these results, the smallest MSE values are obtained from the GWBGPR model for both the number of infant deaths and the number of maternal deaths.

V. CONCLUSION
Geographically Weighted Bivariate Generalized Poisson Regression modeling with Adaptive Bisquare Kernel weighting produces 3 district groups for infant mortality and 5 district groups for maternal mortality. The variables that significantly affect the numbers of infant and maternal deaths are the percentages of pregnant women who visit health workers at least 4 times during pregnancy, pregnant women who receive Fe3 tablets, obstetric complications handled, clean and healthy households, and married women whose age at first marriage was under 18 years. The smallest MSE values are obtained from the GWBGPR model.