Komparasi Model Ensemble dan Algoritma Machine Learning Untuk Memprediksi Penyakit Jantung


  • Muhammad Syarief Albani Universitas Sriwijaya, Palembang, Indonesia
  • Dedy Kurniawan * Universitas Sriwijaya, Palembang, Indonesia
  • Ken Ditha Tania Universitas Sriwijaya, Palembang, Indonesia
  • (*) Corresponding Author
Keywords: Machine Learning; Heart Disease; Ensemble Model

Abstract

This study compares nine machine learning algorithms for predicting heart disease. The dataset dates from 1988 and combines four databases (Cleveland, Hungary, Switzerland, and Long Beach) for a total of 1,025 records. Its medical features describe physiological state, clinical examination results, and cardiovascular risk factors: age, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiography results, maximum heart rate, exercise-induced chest pain, ST segment depression, ST segment slope, number of major vessels visible on fluoroscopy, and thalassemia status. The workflow comprises data cleaning, data transformation, and evaluation using both a train/test split and K-fold cross-validation, scored by accuracy, precision, recall, F1 score, and AUC-ROC. The algorithms are Decision Tree, Random Forest, Support Vector Machine, MLP Classifier, Bagging Classifier, Gradient Boosting, CatBoost, XGBoost, and LightGBM. Ensemble-based models such as CatBoost, Random Forest, XGBoost, and LightGBM showed consistent performance across the evaluation metrics relative to non-ensemble models. CatBoost performed best, with an accuracy of 98%, an F1 score of 0.980, and a recall of 0.9875, followed by the other ensemble algorithms Random Forest, XGBoost, and LightGBM. These results indicate that ensemble models are more effective for predicting heart disease.
The study aims to provide an in-depth comparison of ensemble and other modern machine learning algorithms for heart disease prediction. It also seeks to enrich the literature on applying Knowledge Discovery in the health sector and to provide a basis for selecting more reliable prediction algorithms that support clinical decision-making and the development of machine learning-based diagnosis support systems for heart disease.
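
The evaluation described in the abstract relies on standard classification metrics and K-fold cross-validation. The sketch below is not the authors' code; it is a minimal, dependency-free illustration of how accuracy, precision, recall, and F1 score are computed from binary predictions, and how a dataset of 1,025 records could be divided into folds. The function names `classification_metrics` and `k_fold_indices`, the label convention (1 = heart disease), the choice of 5 folds, and the toy labels are all assumptions made for illustration. AUC-ROC is omitted because it requires predicted scores rather than hard labels.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (assumed: 1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) indices.

    No shuffling or stratification is done here; this is a simplification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Toy labels (hypothetical, not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)

# Fold layout for a dataset the size reported in the abstract, assuming k = 5.
folds = k_fold_indices(1025, 5)
```

In practice the records would usually be shuffled, and often stratified by class, before being split into folds, and each metric would be averaged across folds for every model being compared.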




Article History
Submitted: 2025-09-09
Published: 2026-03-19
How to Cite
Albani, M., Kurniawan, D., & Tania, K. (2026). Komparasi Model Ensemble dan Algoritma Machine Learning Untuk Memprediksi Penyakit Jantung. Building of Informatics, Technology and Science (BITS), 7(4), 2618-2628. https://doi.org/10.47065/bits.v7i4.8346
Section: Articles
