Classification of Cardiovascular Heart Disease Using Adaptive Synthetic Sampling and the Extreme Gradient Boosting Algorithm
Abstract
Cardiovascular diseases are conditions that affect the cardiovascular system, such as heart disease and stroke. According to the World Health Organization (WHO), 17.9 million deaths worldwide in 2019 were attributable to cardiovascular disease. Early detection is crucial, but diagnosing heart disease is difficult in developing countries because diagnostic tools and medical personnel are in limited supply. This study uses the Heart Disease Dataset from Kaggle, which has 15 attributes and 4238 records, to develop a heart disease classification model with XGBoost. The research stages are data imputation, data transformation using LabelEncoder, data balancing using ADASYN, data splitting (80% training data, 20% testing data), and hyperparameter tuning with Bayesian Optimization. The XGBoost model with ADASYN achieves a ROC-AUC of 0.971 and an accuracy of 0.916, compared with a ROC-AUC of 0.698 and an accuracy of 0.841 for the model without ADASYN. These results indicate that ADASYN is effective in improving model performance on imbalanced datasets. Bayesian Optimization also plays an important role in finding the optimal parameter combination, which can further enhance model performance. The findings contribute to the development of early detection methods for cardiovascular heart disease, particularly through the application of the XGBoost classification algorithm.
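The data balancing stage named in the abstract can be illustrated with a minimal sketch of ADASYN-style oversampling. This is not the authors' code: the function name `adasyn`, its parameters, and the toy data are assumptions made for illustration, and the sketch interpolates towards the nearest minority neighbours, which is one common way of realising the method. The idea is that minority samples surrounded by more majority-class neighbours receive proportionally more synthetic samples.

```python
import math
import random


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples from an ADASYN-style procedure.

    Hypothetical helper for illustration; assumes a binary problem with
    numeric features stored as lists of floats.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(y) - n_min
    total_new = int((n_maj - n_min) * beta)  # synthetic samples wanted overall
    if total_new <= 0 or n_min < 2:
        return []

    def nearest(i, candidates):
        # Indices of the k candidates closest to X[i], excluding i itself.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # Difficulty r_i: share of majority points among the k nearest neighbours.
    ratios = []
    for i in min_idx:
        nbrs = nearest(i, range(len(y)))
        ratios.append(sum(y[j] != minority for j in nbrs) / len(nbrs))
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        weights = [1.0 / n_min] * n_min  # no hard samples: spread evenly
    else:
        weights = [r / ratio_sum for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        min_nbrs = nearest(i, min_idx)
        for _ in range(round(w * total_new)):
            j = rng.choice(min_nbrs)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (made up): 40 majority points and 8 minority points in 2-D.
data_rng = random.Random(42)
X_maj = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
X_min = [[data_rng.gauss(1.5, 0.5), data_rng.gauss(1.5, 0.5)] for _ in range(8)]
X = X_maj + X_min
y = [0] * 40 + [1] * 8
new_points = adasyn(X, y)
```

Appending `new_points` with label 1 to the toy data brings the two classes close to balance; the per-sample rounding means the total can differ slightly from the exact class gap. In practice a maintained library implementation would normally be preferred over a hand-written one.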
Pages: 499-508
Copyright (c) 2024 Acep Handika Permana, Fajri Rakhmat Umbara, Fatan Kasyidi

This work is licensed under a Creative Commons Attribution 4.0 International License.