Optimizing the Data Ratio in K-Nearest Neighbor for Multiclass Classification of Obesity Levels in the Adult Population


  • Dini Aprilia Langnegara Universitas Bina Sarana Informatika, Jakarta, Indonesia
  • Titik Misriati * Universitas Bina Sarana Informatika, Jakarta, Indonesia https://orcid.org/0009-0001-6410-4226
  • Imam Nawawi Universitas Bina Sarana Informatika, Jakarta, Indonesia
  • (*) Corresponding Author
Keywords: Obesity; KNN; Classification; Multiclass; Data Ratio

Abstract

Obesity is a complex health problem, and a method for assessing its severity supports earlier detection. An individual's obesity level can be classified from dietary habits, physical activity, and overall health status. This study evaluates how accurately the K-Nearest Neighbor (KNN) algorithm classifies seven obesity levels. The dataset comprised 2,111 samples covering both genders. For testing, the data were split into training and test sets under three scenarios with ratios of 70:30, 80:20, and 90:10, and k was varied from 2 to 10 in each scenario. The best configuration was a 90:10 split with k = 2, which achieved an accuracy of 90.05%, a precision of 90.56%, a recall of 89.80%, and an F1 score of 90.18%. Misclassification was most frequent between the Normal Weight and Class I Overweight categories. These findings indicate that KNN applied to properly preprocessed data can reach a competitive accuracy above 90% when classifying obesity levels in a population.
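The experimental design described in the abstract (three train:test ratios, k from 2 to 10, four evaluation metrics) can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' implementation: the synthetic three-class data, the helper names (make_synthetic, split, knn_predict, macro_metrics, sweep), the Euclidean distance, the macro averaging, and the tie-breaking rule are all assumptions, and no preprocessing step is shown.

```python
import math
import random
from collections import Counter

def make_synthetic(n_per_class=40, n_classes=3, seed=0):
    """Hypothetical stand-in data: one 2-D Gaussian blob per class."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            X.append((rng.gauss(3.0 * c, 1.0), rng.gauss(3.0 * c, 1.0)))
            y.append(c)
    return X, y

def split(X, y, train_frac, seed=0):
    """Shuffle once, then cut at train_frac (0.9 gives a 90:10 split)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_frac * len(idx)))
    tr, te = idx[:cut], idx[cut:]
    return ([X[i] for i in tr], [y[i] for i in tr],
            [X[i] for i in te], [y[i] for i in te])

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    labels = [y_train[i] for i in order[:k]]  # nearest neighbor first
    votes = Counter(labels)
    top = max(votes.values())
    # Assumed tie-break: among tied classes, the nearest neighbor's class wins.
    for label in labels:
        if votes[label] == top:
            return label

def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precs, recs, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(f1)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    n = len(classes)
    return acc, sum(precs) / n, sum(recs) / n, sum(f1s) / n

def sweep(X, y, train_fracs=(0.7, 0.8, 0.9), ks=range(2, 11)):
    """Evaluate every (train fraction, k) pair on its held-out test set."""
    results = {}
    for frac in train_fracs:
        X_tr, y_tr, X_te, y_te = split(X, y, frac)
        for k in ks:
            preds = [knn_predict(X_tr, y_tr, x, k) for x in X_te]
            results[(frac, k)] = macro_metrics(y_te, preds)
    return results

X, y = make_synthetic()
results = sweep(X, y)
best = max(results, key=lambda key: results[key][0])  # rank by accuracy
print("best (train fraction, k):", best, "accuracy: %.3f" % results[best][0])
```

Note that a 90:10 split leaves a small test set, so each misclassified sample shifts the reported accuracy more than it would under 70:30. Scores obtained at different ratios are therefore measured on test sets of different sizes.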



Article History
Published: 2026-03-31
Section: Articles