Performance Analysis of K-Nearest Neighbor with F1-Score Optimization and the SMOTE Technique for Heart Attack Risk Classification
Abstract
Heart attacks are among the leading causes of death worldwide, so early risk prediction is essential for improving patient outcomes. Many medical datasets, however, are class-imbalanced: high-risk cases are far fewer than normal cases. This imbalance can bias machine learning models toward the majority class and weaken their ability to detect high-risk patients. This study analyzes the performance of the K-Nearest Neighbor (KNN) algorithm, tuned for F1-score and combined with the Synthetic Minority Over-sampling Technique (SMOTE), for heart attack risk classification. The dataset is the Heart Attack Dataset, which contains both numerical and categorical features. The study takes an experimental approach, building a machine learning pipeline that comprises data preprocessing, missing-value handling, feature standardization, oversampling with SMOTE, and hyperparameter optimization through GridSearchCV with F1-score as the main evaluation metric. Models are evaluated using Stratified 5-Fold Cross-Validation on accuracy, precision, recall, F1-score, and ROC-AUC. The baseline KNN model achieves an accuracy of 98.50%, a precision of 95.27%, a recall of 81.47%, and a ROC-AUC of 0.9278. The KNN model integrated with SMOTE attains a recall of 87.27% and a ROC-AUC of 0.9484, indicating improved detection of heart attack cases and a 31% reduction in false negatives, although precision decreases to 72.15%. These findings indicate that integrating SMOTE with hyperparameter optimization improves model sensitivity at the cost of precision, making the model more suitable for medical applications that prioritize patient safety.
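The reported 31% reduction in false negatives can be checked against the two recall values given in the abstract. The sketch below is a minimal illustration, not the authors' code: it assumes the reduction is measured on the false-negative rate (1 minus recall), and the function names are hypothetical. An F1 helper is included only to show the definition of the metric; the abstract does not report F1 values, and the harmonic mean of averaged precision and recall is only a rough proxy for a fold-averaged F1.

```python
def false_negative_reduction(recall_before: float, recall_after: float) -> float:
    """Relative drop in the false-negative rate (1 - recall).

    Assumption: both recalls are measured on the same positive class,
    so the number of actual positives cancels out of the ratio.
    """
    fn_rate_before = 1.0 - recall_before
    fn_rate_after = 1.0 - recall_after
    return (fn_rate_before - fn_rate_after) / fn_rate_before


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Recall values as reported in the abstract:
    # baseline KNN 81.47%, KNN with SMOTE 87.27%.
    reduction = false_negative_reduction(0.8147, 0.8727)
    print(f"Relative reduction in false negatives: {reduction:.1%}")
```

Under this assumption the false-negative rate falls from 18.53% to 12.73%, a relative drop of roughly 31%, which agrees with the figure stated in the abstract.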
Pages: 2586-2596
Copyright (c) 2026 Fikri Luqman Pratama, Muhamad Akrom

This work is licensed under a Creative Commons Attribution 4.0 International License.