Application of Logistic Regression, K-NN, and Naïve Bayes Based on the CRISP-DM Approach for Predicting Heart Disease
Abstract
Heart disease remains the leading cause of death worldwide, even though much of it can be controlled through early detection and effective risk-factor management. Machine learning can make early detection more accurate and efficient by supporting predictive models of heart disease risk. This research applies the CRISP-DM methodology to build and compare such models using three supervised learning algorithms, K-Nearest Neighbors (K-NN), Naïve Bayes, and Logistic Regression, in order to identify the best-performing one. The dataset is a heart disease dataset obtained from the Kaggle platform, consisting of 10,000 records with variables such as Age, Blood Pressure, Smoking, Diabetes, Cholesterol, Triglyceride Level, Fasting Blood Sugar, and CRP Level. For the K-NN model, experiments were run with three values of k (k = 5, k = 10, and k = 20) to examine the effect of the number of neighbors on model performance. The Naïve Bayes and Logistic Regression models were implemented with default parameters and no additional tuning, to keep the comparison consistent. Model performance was evaluated using Accuracy and F1-Score. The results indicate that the K-NN model with k = 5 performed best, with an accuracy of 0.7203 and an F1-Score of 0.7598, outperforming the Naïve Bayes and Logistic Regression models.
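To make the K-NN comparison concrete, the following is a minimal sketch of how the number of neighbors k can be varied and scored with Accuracy and F1-Score. It is not the authors' implementation: the data are synthetic two-feature points rather than the Kaggle heart disease records, and the helper names (`knn_predict`, `accuracy`, `f1_score`, `make_toy_data`), the Euclidean distance, and the tie-breaking rule are all assumptions made for illustration. The scores it prints will therefore not match the values reported in the abstract.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # With an even k the vote can tie; pick the smaller label so runs are repeatable.
    return max(sorted(votes), key=lambda label: votes[label])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_toy_data(n, seed):
    """Synthetic stand-in data: class 0 near (0, 0), class 1 near (1.5, 1.5)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.5 * label
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


# Hypothetical train/test sets; the study itself uses 10,000 Kaggle records.
train_X, train_y = make_toy_data(200, seed=0)
test_X, test_y = make_toy_data(100, seed=1)

# Score the same three neighbor counts that the abstract mentions.
results = {}
for k in (5, 10, 20):
    preds = [knn_predict(train_X, train_y, q, k) for q in test_X]
    results[k] = (accuracy(test_y, preds), f1_score(test_y, preds))
    print(f"k={k:>2}  accuracy={results[k][0]:.4f}  f1={results[k][1]:.4f}")
```

Each k is scored on the same held-out points, so differences between rows reflect only the neighbor count. The deterministic tie-break matters for the even values k = 10 and k = 20, where a binary vote can split evenly.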
Pages: 2771-2779
Copyright (c) 2026 Rayna Shera Chang, Natalie Grace Widjaja Kuswanto, Jessica Laurentia Tedja, Christopher Andreas

This work is licensed under a Creative Commons Attribution 4.0 International License.