Application of the Synthetic Minority Oversampling Technique to Child Smoker Data in West Nusa Tenggara, 2021

Rahma Mutiara Sari, Achmad Prasetyo


Indonesia ranks as the country with the highest number of young smokers in Southeast Asia. This is a serious concern because smoking can cause a range of health problems and can lead to death. In 2021, West Nusa Tenggara Province had the highest percentage of children who smoke in Indonesia, at 2.28%. Data on children's smoking status are imbalanced because the ratio of children who smoke to children who do not is highly skewed. Binary logistic regression combined with the Synthetic Minority Oversampling Technique (SMOTE) was therefore applied to handle this problem. This study aims to describe children's smoking behavior in West Nusa Tenggara in 2021 and to identify the variables that influence it, together with the tendency associated with each. The data are secondary data from the 2021 National Socio-Economic Survey, with children aged 5 to 17 years in West Nusa Tenggara as the unit of analysis. The results show that gender, economic status, age, region of residence, education level of the head of household, and schooling status influenced children's smoking behavior in West Nusa Tenggara in 2021, with children who did not attend school having the greatest tendency to smoke.


imbalanced data; child smoking behavior; logistic regression; SMOTE
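
As an illustration of the oversampling idea named in the abstract, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function name `smote`, the parameter `k`, and the toy data are assumptions made for this example, and the data are randomly generated rather than taken from the survey. The sketch handles numeric features only; categorical predictors such as gender or schooling status would need a variant designed for nominal features.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the base itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy data (made up for illustration): 95 majority and 5 minority samples,
# each with two numeric features.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(95)]
minority = [(rng.gauss(2, 0.5), rng.gauss(2, 0.5)) for _ in range(5)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without duplicating records exactly. A classifier such as binary logistic regression would then be fitted on the balanced set.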

