Application of Logistic Regression, K-NN, and Naïve Bayes Based on the CRISP-DM Approach for Predicting Heart Disease


  • Rayna Shera Chang, Universitas Ciputra, Surabaya, Indonesia
  • Natalie Grace Widjaja Kuswanto, Universitas Ciputra, Surabaya, Indonesia
  • Jessica Laurentia Tedja, Universitas Ciputra, Surabaya, Indonesia
  • Christopher Andreas (*), Universitas Ciputra, Surabaya, Indonesia
  • (*) Corresponding Author
Keywords: CRISP-DM; K-Nearest Neighbors (K-NN); Naïve Bayes; Heart Disease; Logistic Regression

Abstract

Heart disease remains the leading cause of death worldwide, even though much of its burden can be controlled through early detection and effective risk-factor management. Machine learning can make early detection more accurate and efficient by supporting predictive models of heart disease risk. This research follows the CRISP-DM methodology to build three supervised classification models, K-Nearest Neighbors (K-NN), Naïve Bayes, and Logistic Regression, and compares their performance to identify the best-performing one. The dataset is a heart disease dataset obtained from the Kaggle platform, consisting of 10,000 records with variables such as Age, Blood Pressure, Smoking, Diabetes, Cholesterol, Triglyceride Level, Fasting Blood Sugar, and CRP Level. For the K-NN model, experiments used three values of k (k = 5, k = 10, and k = 20) to examine how the number of neighbors affects performance. The Naïve Bayes and Logistic Regression models were implemented with default parameters and no additional tuning, so that both are compared under the same conditions. Model performance was evaluated using Accuracy and F1-Score. The K-NN model with k = 5 performed best, with an accuracy of 0.7203 and an F1-Score of 0.7598, outperforming the Naïve Bayes and Logistic Regression models.
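The K-NN part of the evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data is a small synthetic two-feature set rather than the Kaggle dataset, the 300/100 split is arbitrary, and the helper names (`knn_predict`, `make_synthetic`, `evaluate_knn`) are hypothetical. It only shows how Accuracy and F1-Score might be compared across k = 5, 10, and 20.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean).

    On a tied vote, the label that appears first among the nearest
    neighbors wins (Counter keeps insertion order).
    """
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_synthetic(n, seed):
    """Two overlapping Gaussian clusters; an assumed stand-in for patient data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        X.append((rng.gauss(center, 1.5), rng.gauss(center, 1.5)))
        y.append(label)
    return X, y


def evaluate_knn(k_values=(5, 10, 20), seed=0):
    """Return {k: (accuracy, f1)} on a held-out part of the synthetic data."""
    X, y = make_synthetic(400, seed)
    split = 300  # first 300 rows for training, last 100 for testing
    train_X, train_y = X[:split], y[:split]
    test_X, test_y = X[split:], y[split:]
    results = {}
    for k in k_values:
        preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
        results[k] = (accuracy(test_y, preds), f1_score(test_y, preds))
    return results


if __name__ == "__main__":
    for k, (acc, f1) in evaluate_knn().items():
        print(f"k={k:>2}  accuracy={acc:.4f}  f1={f1:.4f}")
```

The scores printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract. In practice, a library such as scikit-learn would likely replace the hand-written classifier and metrics, and features on different scales (for example Age and Cholesterol) would usually be standardized before computing distances.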




Article History
Submitted: 2025-10-13
Published: 2026-03-31
How to Cite
Chang, R., Kuswanto, N., Tedja, J., & Andreas, C. (2026). Penerapan Regresi Logistik, K-NN, dan Naïve Bayes Berbasis Pendekatan CRISP-DM dalam Memprediksi Penyakit Jantung. Building of Informatics, Technology and Science (BITS), 7(4), 2771-2779. https://doi.org/10.47065/bits.v7i4.8518
Section: Articles