Analysis of Stunting Prediction in Toddlers in Bekasi District Using Random Forest and Naïve Bayes

Chintya Annisah Solin; Putu Harry Gunawan

doi:10.47065/bits.v6i4.6670

Chintya Annisah Solin * Telkom University, Bandung, Indonesia
Putu Harry Gunawan Telkom University, Bandung, Indonesia

(*) Corresponding Author

DOI: https://doi.org/10.47065/bits.v6i4.6670

Keywords: Stunting; Naïve Bayes; Random Forest; ADASYN; K-Fold

Abstract

This study compares the performance of the Random Forest and Naïve Bayes algorithms in predicting stunting in toddlers, using data from the Bekasi District Health Office. The analysis begins with data cleaning and normalization, followed by oversampling with the Adaptive Synthetic Sampling (ADASYN) method to handle class imbalance, and validation with Stratified K-Fold Cross-Validation. Random Forest achieves the best results, with an accuracy of 89.62% and an F1-score of 89.09%. Gaussian Naïve Bayes achieves an accuracy of 88.72% and an F1-score of 88.81%, while Bernoulli Naïve Bayes performs worse, with an accuracy of 67.83% and an F1-score of 69.72%. Random Forest shows advantages in handling noise and imbalanced data, which makes it the strongest choice for stunting prediction among the models tested. The performance of Naïve Bayes depends on the characteristics of the data: the Gaussian variant is better suited to continuous features. These results indicate that choosing an appropriate algorithm is important for prediction accuracy, especially on imbalanced data. The study also recommends closer attention to data preprocessing to ensure prediction quality, especially for minority classes.
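The validation and scoring steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stratified_k_fold` and `accuracy_and_f1`, the toy labels, the choice of 5 folds, and the use of a binary F1-score for the stunted class are all assumptions made for illustration. The abstract does not state the number of folds or how the F1-score was averaged. The sketch shows two things: stratified folds keep the minority-class share constant across folds, and accuracy alone can look high on imbalanced data while the F1-score for the minority class is zero.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, binary F1 for the `positive` class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Toy labels (hypothetical, not the Bekasi data): 90 non-stunted (0), 10 stunted (1).
labels = [0] * 90 + [1] * 10
folds = list(stratified_k_fold(labels, k=5))
minority_per_fold = [sum(labels[i] for i in test_idx) for _, test_idx in folds]

# A baseline that always predicts the majority class (non-stunted).
majority_pred = [0] * len(labels)
baseline_accuracy, baseline_f1 = accuracy_and_f1(labels, majority_pred)
```

On these toy labels every test fold contains the same number of stunted cases, and the majority-class baseline scores high accuracy with an F1-score of zero, which is why the abstract reports both metrics. In practice one would more likely rely on library implementations such as scikit-learn's `StratifiedKFold` and imbalanced-learn's `ADASYN`; the abstract does not say which tools the authors used.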


