Searching the Appropriate Minimum Sample Size Calculation Method for Commuter Train Passenger Travel Behavior Survey

Commuter train service is a facility that must be managed properly, economically, and efficiently in accordance with the principles of Facility Asset Management. The availability of infrastructure and vehicles for this facility is adjusted to passenger demand, which requires sufficient knowledge of the travel behavior characteristics, i.e., the proportion composition of each characteristic. A travel behavior survey requires an appropriate formula or method for calculating the minimum sample size, in this case for proportions of the form pq, pqr, pqrs, etc. Therefore, a search for a minimum sample size calculation method for the travel behavior survey is needed. A literature study was employed for this search. This is important because a calculation method for the minimum sample size exists for the pq proportion case, but not yet for the pqr, pqrs, and higher cases. The results of the study indicate that the SR Method is the most appropriate method for calculating the minimum sample size for the pqr, pqrs, pqrst, and higher proportion cases. The SR Method was developed from the Goodness of Fit method combined with the Maximum Acceptable Error principle. The combination of the two is named the MAECCL (Maximum Acceptable Error on a Certain Confidence Level) principle.
Keywords: facility asset management, commuter train service, passenger demand, travel behavior, minimum sample size


INTRODUCTION
Facilities are important components that must be managed properly, economically, and efficiently according to the principles of Facility Asset Management. One form of transportation facility in urban areas is the commuter train service. A commuter train service requires infrastructure and vehicles, and their availability is adjusted to passenger demand (Soemitro & Suprayitno, 2018). Planning the commuter train service therefore requires sufficient knowledge of commuter train passenger travel behavior (Susanti, Soemitro & Suprayitno, 2018). The parameters of the travel behavior characteristics are the number of trips, the origin and destination, the transportation mode used, and the selected route (Khisty & Lall, 2003; Suprayitno & Upa, 2016).
Research on travel behavior characteristics collects data on passenger characteristics and travel behavior characteristics. Both consist of certain proportion compositions. The proportion compositions for the passenger characteristics are, for example, gender, marital status, age, education, and employment. Those for the travel behavior characteristics include, among others, the trip purpose, the mode used, and the access and egress distance.
The number of proportion compositions varies from one characteristic to another. Gender has two compositions, male (p) and female (q), while trip purpose consists of nine compositions, pqrstuvwx. This difference in the number of compositions requires an appropriate formula for calculating the minimum sample size. Choosing that formula is an important step in the research, because the sample size is very closely related to the level of error (Noordzij et al., 2010).
Therefore, a preliminary study is needed to search for the appropriate minimum sample size calculation method for proportion cases.

METHOD
The appropriate minimum sample size calculation method for the survey of commuter train passenger travel behavior characteristics was sought through a literature study (see Figure 1).

RESEARCH ANALYSIS
Proportion Composition on Travel Behavior Characteristics
Two aspects are important in research on the travel behavior characteristics of commuter train passengers: the passenger characteristics and the travel behavior characteristics. The passenger characteristics include the proportion compositions of gender, marital status, and age, while the travel behavior characteristics include, among others, the proportion compositions of the travel intent and of the mode used before using the commuter train (see Table 2).

Examples of Case Studies of Travel Behavior Characteristics
Travel behavior characteristics have been studied by many researchers. One such study examined the travel distances to the BRI Kertajaya Office. It involved a pqrs proportion case for the access distances. The proportion composition was presented in a table and in a chart of the proportion distribution curve (see Table 3 and Figure 2) (Suprayitno, Ratnasari & Saraswati, 2017).

Calculation Formula for Minimum Sample Size
The formulas for calculating the minimum sample size can be divided into four categories: 1) the minimum sample for the average value; 2) the minimum sample for the proportion value; 3) the minimum sample for the variance value; and 4) the Slovin formula. These formulas were gathered from several literature sources. For each of the first three categories more than one formula was found, while the Slovin category has only a single formula. The formulas are shown in Table 4, and the summary can be described as follows:
• The minimum sample size formulas for the average value case cannot be used in research on travel behavior characteristics, because this research relates to proportion values rather than average values.
• The minimum sample size formulas for the proportion value case can be used in research on travel behavior characteristics, because this research relates to the proportion case.
• The minimum sample size formulas for the variance value case cannot be used in research on travel behavior characteristics, because this research relates to proportion values.
• The Slovin formula is not appropriate for calculating the minimum sample size in travel behavior research, because it does not contain a proportion, it requires the population size to be known in advance, and it cannot indicate the quality of the sample in terms of MAECCL (Suprayitno, Saraswati & Fajrinia, 2016).

Calculation Trial for the Minimum Sample Size by Using the Proportion Formula
The formula obtained for the proportion case is designated only for the pq proportion, whereas the proportions dealt with in travel behavior vary and include the pqr, pqrs, and higher cases. Therefore, the existing formula had to be tested for proportion cases beyond pq, i.e., pqr, pqrs, etc. (see Table 5). Table 5 shows the minimum sample size calculation results. Three important points must be noted.
• The minimum sample size for the pq proportion was 75 people.
• The minimum sample size for the pqr proportion was 13 people.
• The minimum sample size for the pqr proportion should be larger than the minimum sample size for the pq proportion.
Therefore, the formula for the proportion case cannot be used to calculate the minimum sample size for the pqr and higher proportion cases. How to calculate the appropriate minimum sample size for the pqr, pqrs, pqrst, and higher proportion cases thus remains an open question.
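The shrinking result can be illustrated with a minimal sketch. It assumes the widely used two-class form n = z²pq/e² and a naive extension that simply multiplies in further proportions; neither the exact formula nor the inputs behind Table 5 are given in this section, so the function names, the default z and e values, and the equal proportions below are illustrative assumptions, and the outputs are not expected to reproduce the values of 75 and 13.

```python
import math


def min_sample_pq(p, z=1.96, e=0.10):
    """Two-class (pq) minimum sample size, assuming n = z^2 * p * q / e^2."""
    q = 1.0 - p
    return math.ceil(z ** 2 * p * q / e ** 2)


def naive_product_extension(proportions, z=1.96, e=0.10):
    """Hypothetical extension: multiply all class proportions together.

    This is NOT a valid formula; it only shows why the result shrinks
    as more classes are added (each extra factor is smaller than 1).
    """
    product = math.prod(proportions)
    return math.ceil(z ** 2 * product / e ** 2)


if __name__ == "__main__":
    n_pq = min_sample_pq(0.5)                          # two equal classes
    n_pqr = naive_product_extension([1 / 3] * 3)       # three equal classes
    print("pq :", n_pq)
    print("pqr:", n_pqr)
```

Because every additional proportion is a factor below one, the product, and with it the computed sample size, falls as classes are added, which is the opposite of what a survey with more classes needs.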
The proportion can be presented in a table or as a graph. The Goodness of Fit Test is designed to test whether two graphs are the same or not. An experiment was carried out to test whether this test can be used for calculating the minimum sample size for the proportion case.

Minimum Sample Size Calculation by using the Goodness of Fit Test
Statistical techniques include inferential statistics, one of which is the Goodness of Fit test. It is usually used to check whether the sample distribution curve is similar to the population curve. The test uses the Chi-Square statistic ($\chi^2$) to determine the similarity of the curves (Suprayitno et al., 2017), as written below.
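H0 : $\chi^2 \le \chi_0^2$, meaning that the sample curve is similar to the curve of the total population
H1 : $\chi^2 > \chi_0^2$, meaning that the sample curve is different from the curve of the total population
where $\chi^2$ is the statistic calculated from the sample and $\chi_0^2$ is the critical value (Suprayitno et al., 2016).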
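For reference, the Chi-square statistic in a test of this kind is commonly written in the Pearson form below. This is a sketch that assumes the standard form was used; the symbols $O_i$, $E_i$, $k$, $n$, and $\pi_i$ are introduced here for illustration and do not come from the original text.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = n \, \pi_i
```

Here $O_i$ is the number of sampled persons observed in class $i$, $E_i$ is the number expected if the sample followed the population exactly, $n$ is the sample size, $\pi_i$ is the population proportion of class $i$, and $k$ is the number of classes.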
A certain number of samples was taken and treated as a reference, i.e., considered as the population. From this considered population, a certain number of samples was drawn, with the sample size expressed as a percentage of the population. The sample proportion distribution curve was then checked against the reference proportion distribution curve to see whether the two can be considered the same. If the two curves are the same, it can be concluded that the sample size, measured as a percentage, is sufficient.
The results of the experiment reported in the journal still contained errors, as presented below.
• The confidence level was set at 95%, and the maximum acceptable error at 10%.
• The population was 50 persons.
• The sample was 40 persons (80% of the total population).
• The access distance intervals were set as follows: 0-3 km, 3-6 km, 6-9 km, and 9-12 km.
• The Goodness of Fit Test calculation gave the following result.
• $\chi^2 < \chi_0^2$ (1.376 < 2), meaning that the sample curve can be considered the same as the population curve (see Figure 3).
• The mean absolute error was 9%.
• It can be concluded that an 80% sample size is sufficient.

Minimum Sample Size Calculation by using the SR Method
The SR Method was developed for calculating the minimum sample size for the proportion distribution case. It uses a combination of the Goodness of Fit test and the Maximum Acceptable Error principle, and therefore follows the MAECCL principle: Maximum Acceptable Error at a Certain Confidence Level. The Goodness of Fit test uses the Chi-square ($\chi^2$) statistic, while the Maximum Acceptable Error test uses the Average Absolute Error. Employing the Maximum Acceptable Error test compensates for the weakness of using the Goodness of Fit test alone. If a sample satisfies both tests, the Goodness of Fit test and the Maximum Acceptable Error test, the sample size can be considered sufficient. The name SR is taken from the first letters of the authors' names (Suprayitno et al., 2017). The statistical tests used are the Chi-square test and the Average Absolute Error test.
A trial of the SR Method was carried out for the case of the access distance from home to BRI Kertajaya Surabaya for the bank employees (Suprayitno et al., 2017). A reference population of 50 persons was taken with a Confidence Level of 90% at a Maximum Acceptable Error (MAE) of 10%. Four sampling levels of 90%, 80%, 70%, and 60% were investigated. For each sampling level, three samples were collected, namely Sample A, Sample B, and Sample C. The calculation is presented in Table 6. The experiment shows the following results.
• The 90% samples gave curves that were the same as the 100% population curve (H0: $\chi^2 \le \chi_0^2$). The $\chi^2$ values were 0.031, 0.288, and 0.344, all smaller than $\chi_0^2$ = 5.99. The three average absolute error values (ē) were smaller than the MAE.
• The 80% samples gave curves that were the same as the 100% population curve (H0: $\chi^2 \le \chi_0^2$). The three average absolute error values (ē) were less than the MAE.
• The 70% samples showed that not all curves were similar to the population curve: $\chi^2_{Ai} > \chi_0^2$, with 6.670 > 5.99. The average absolute error value (ē) in sample 70A exceeded the maximum error limit (MAE), i.e., 11.8% > 10%.
• The 60% samples gave curves that were the same as the 100% population curve (H0: $\chi^2 \le \chi_0^2$), but the average absolute error values (ē) for 60A, 60B, and 60C exceeded the MAE.
The calculation result for $\chi^2$ and |ē| using the SR Method, for the 80% sample, is presented in Table 6. It can be concluded that the SR Method is so far appropriate for calculating the minimum sample size for the proportion characteristic case. The sample proportion distribution can be checked against that of the population, and the error can be checked so that it does not exceed a certain value, following the MAECCL principle.
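The two checks described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the Chi-square statistic is taken in the standard Pearson form, the average absolute error is assumed to be the mean relative deviation of each class proportion (in percent), the critical values are the standard 95% table entries, and all function names and counts are hypothetical. The degrees of freedom are passed explicitly because this section does not state how they were chosen.

```python
# Standard Chi-square critical values at the 95% level, keyed by degrees of freedom.
CHI2_CRIT_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square(sample_counts, population_counts):
    """Pearson statistic of the sample counts against the population proportions."""
    n_s = sum(sample_counts)
    n_p = sum(population_counts)
    stat = 0.0
    for obs, pop in zip(sample_counts, population_counts):
        expected = n_s * pop / n_p  # count expected if the sample mirrored the population
        stat += (obs - expected) ** 2 / expected
    return stat


def mean_abs_error_percent(sample_counts, population_counts):
    """Assumed error measure: mean relative deviation of class proportions, in percent."""
    n_s = sum(sample_counts)
    n_p = sum(population_counts)
    errors = [
        abs(obs / n_s - pop / n_p) / (pop / n_p)
        for obs, pop in zip(sample_counts, population_counts)
    ]
    return 100.0 * sum(errors) / len(errors)


def sr_check(sample_counts, population_counts, dof, mae_percent=10.0):
    """Apply both tests; the sample is sufficient only if both pass."""
    chi2 = chi_square(sample_counts, population_counts)
    err = mean_abs_error_percent(sample_counts, population_counts)
    fit_ok = chi2 <= CHI2_CRIT_95[dof]
    error_ok = err <= mae_percent
    return {"chi2": chi2, "error": err, "fit_ok": fit_ok,
            "error_ok": error_ok, "sufficient": fit_ok and error_ok}


if __name__ == "__main__":
    population = [20, 15, 10, 5]   # hypothetical reference population of 50 persons
    sample = [20, 12, 6, 2]        # hypothetical 80% sample of 40 persons
    print(sr_check(sample, population, dof=3))
```

With the hypothetical counts above, the sample passes the Goodness of Fit test but fails the error test, which is the situation the second test is meant to catch.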

SR Method Utilization Trial for Commuter Train Passenger Travel Behavior
The appropriate method for calculating the minimum sample size for proportion cases with more than the two pq classes, the SR Method, has been found. The SR Method is now tried for calculating the minimum sample size for travel behavior on the Surabaya commuter train.
A travel behavior survey was carried out on the SUPOR (Surabaya-Porong) commuter train in Surabaya. The survey took place on September 19, 2018, at 5:30 a.m. to 5:05 a.m., for the Porong-Surabaya direction, with 106 passengers as the total population. Data on the "reason for riding commuter train" were taken. The questionnaire offered 5 reasons, including "others", so the proportion distribution consists of pqrst. The survey was done on board. The data were presented as tables and curves of the proportion distribution (see Table 7 and Figure 5). Three different 90% samples of the population of 106 individuals were taken to test whether a 90% sample can satisfy the minimum sample size for this case. The three different 90% samples are called 90A, 90B, and 90C, and were taken from the upper 90%, middle 90%, and bottom 90% of the population. The proportion distributions of the population and the three 90% samples are presented in Figure 6. The statistical tests, Goodness of Fit and Maximum Acceptable Error, were then conducted. The calculation result is presented in Table 8. The Chi-Square ($\chi^2$) test indicated that $\chi^2 \le \chi_0^2$ for all three samples: the $\chi^2$ values of 90A (1.665), 90B (0.660), and 90C (0.249) were less than the $\chi_0^2$ (3 degrees of freedom, 95%) of 7.81. This means that the three sample curves are the same as the population curve.
Meanwhile, the MAE calculation indicated that for Sample 90A the error value (ē) was 11.440, more than the MAE (maximum acceptable error) of 10. It can be concluded that the 90% sample is only marginally satisfactory.
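The way the three 90% samples are drawn from the upper, middle, and bottom parts of the population can be sketched as contiguous windows over the ordered survey records. How the windows were actually delimited is not stated, so the rounding of the sample size, the centering of the middle window, and the function and key names are assumptions made for illustration.

```python
def contiguous_samples(records, fraction=0.9):
    """Return three overlapping windows covering `fraction` of the ordered records."""
    n = len(records)
    size = round(n * fraction)   # assumed rounding of the sample size
    slack = n - size             # records left out of each window
    mid_start = slack // 2       # assumed: middle window is roughly centered
    return {
        "upper": records[:size],
        "middle": records[mid_start:mid_start + size],
        "bottom": records[n - size:],
    }


if __name__ == "__main__":
    # Hypothetical stand-in for the 106 questionnaire records, in survey order.
    records = list(range(106))
    for name, window in contiguous_samples(records).items():
        print(name, len(window), window[0], window[-1])
```

Each window can then be tallied by answer class and compared with the full population using the two SR tests.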

Overview of Searching Steps in Finding Calculation Method
Finally, finding the appropriate method for calculating the minimum sample size for the travel behavior survey of commuter train passengers required several trials, as shown in Figure 7.