Diabetes Mellitus Risk Classification Using K-Nearest Neighbors with Performance Improvement Through the ADASYN Oversampling Technique
Abstract
Diabetes mellitus is a chronic metabolic disease whose global prevalence continues to rise. Early detection of diabetes risk is crucial for reducing long-term complications and the associated healthcare costs. A major challenge in applying machine learning to medical data, however, is class imbalance, which biases models toward the majority class. This study develops a diabetes risk classification model that combines the K-Nearest Neighbors (KNN) algorithm with the Adaptive Synthetic Sampling (ADASYN) technique to address class imbalance. The dataset, obtained from the Kaggle platform, contains 2,000 patient samples with nine predictive features. Preprocessing consisted of missing-value imputation, outlier handling by winsorizing, and feature normalization with StandardScaler. ADASYN was then applied to generate adaptive synthetic samples for the minority class, and the KNN model was trained and evaluated using the confusion matrix, precision, recall, F1-score, accuracy, and ROC-AUC. With ADASYN, ROC-AUC improved by 5.48 percentage points (from 91.34% to 96.82%) and overall accuracy by 2.50 percentage points (from 81.50% to 84.00%). The F1-score for the Diabetes class also increased by 0.40%. The integration of KNN and ADASYN proved effective in improving the detection of high-risk diabetes patients and the model's sensitivity to the minority class.
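The core idea of pairing ADASYN with KNN can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation, which is not shown in the abstract; in practice one would more likely rely on library implementations such as imbalanced-learn's `ADASYN` and scikit-learn's `KNeighborsClassifier`. The function names (`k_nearest`, `adasyn`, `knn_predict`), the toy two-feature data, and the simplifications (oversampling to full balance, no imbalance threshold, no scaling step) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def k_nearest(point, data, k, skip=None):
    """Indices of the k rows of `data` closest to `point` (Euclidean distance)."""
    order = sorted(
        (i for i in range(len(data)) if i != skip),
        key=lambda i: math.dist(point, data[i]),
    )
    return order[:k]


def adasyn(X, y, minority=1, k=5, seed=0):
    """Simplified ADASYN: append synthetic minority rows, favouring hard regions."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_needed = (len(y) - len(min_idx)) - len(min_idx)  # gap to full balance
    if n_needed <= 0 or len(min_idx) < 2:
        return list(X), list(y)

    # r_i: share of other-class rows among the k neighbours of minority row i.
    ratios = []
    for i in min_idx:
        neighbours = k_nearest(X[i], X, k, skip=i)
        ratios.append(sum(y[j] != minority for j in neighbours) / len(neighbours))
    total = sum(ratios)
    if total == 0:  # no minority row borders the other class
        return list(X), list(y)

    min_rows = [X[i] for i in min_idx]
    k_min = min(k, len(min_rows) - 1)
    X_new, y_new = list(X), list(y)
    for pos, r in enumerate(ratios):
        n_i = round(r / total * n_needed)  # adaptive share for this row
        neighbours = k_nearest(min_rows[pos], min_rows, k_min, skip=pos)
        for _ in range(n_i):
            other = min_rows[rng.choice(neighbours)]
            gap = rng.random()
            # Interpolate between the row and one of its minority neighbours.
            X_new.append([a + gap * (b - a) for a, b in zip(min_rows[pos], other)])
            y_new.append(minority)
    return X_new, y_new


def knn_predict(X_train, y_train, point, k=5):
    """Majority vote among the k nearest training rows."""
    votes = Counter(y_train[i] for i in k_nearest(point, X_train, k))
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 10 majority rows (label 0), 4 minority rows (label 1).
X = [[0.0, 0.0], [0.5, 0.2], [1.0, 0.1], [0.2, 0.8], [0.8, 0.9],
     [1.2, 0.6], [0.4, 0.5], [1.5, 1.0], [0.1, 0.3], [0.9, 0.4],
     [2.0, 2.0], [2.2, 1.8], [1.6, 1.4], [2.5, 2.4]]
y = [0] * 10 + [1] * 4

X_bal, y_bal = adasyn(X, y, minority=1, k=3, seed=0)
print(len(X_bal), Counter(y_bal))                   # 20 rows, 10 per class
print(knn_predict(X_bal, y_bal, [2.1, 2.0], k=3))   # 1
print(knn_predict(X_bal, y_bal, [0.5, 0.5], k=3))   # 0
```

In this toy set only the minority row at (1.6, 1.4) has a majority-class neighbour, so all six synthetic rows are generated around it; this concentration on borderline samples is what distinguishes ADASYN from uniform oversampling. As general practice, oversampling should be applied to the training split only, so that the evaluation metrics are computed on untouched data.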
Pages: 2238-2247
Copyright (c) 2025 Muhammad Bagir, Hendra Mayatopani, Umbar Riyanto, Dedy Alamsyah

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).