Classification of Cardiovascular-Type Heart Disease Using Adaptive Synthetic Sampling and the Extreme Gradient Boosting Algorithm


  • Acep Handika Permana Universitas Jenderal Achmad Yani, Cimahi, Indonesia
  • Fajri Rakhmat Umbara * Universitas Jenderal Achmad Yani, Cimahi, Indonesia
  • Fatan Kasyidi Universitas Jenderal Achmad Yani, Cimahi, Indonesia
  • (*) Corresponding Author
Keywords: Heart Disease; Cardiovascular; Classification; ADASYN; XGBoost

Abstract

Cardiovascular diseases are conditions that affect the heart and blood vessels, such as heart disease and stroke. According to data from the World Health Organization (WHO), 17.9 million deaths worldwide in 2019 were attributable to cardiovascular disease. Early detection is crucial, but diagnosing heart disease is difficult in developing countries because diagnostic tools and medical personnel are in short supply. This study uses the Heart Disease Dataset from Kaggle, which contains 15 attributes and 4238 records, to develop a heart disease classification model with XGBoost. The research stages are data imputation, data transformation with LabelEncoder, data balancing with ADASYN, data splitting (80% training data, 20% testing data), and hyperparameter tuning with Bayesian Optimization. The XGBoost model with ADASYN achieves a ROC-AUC of 0.971 and an accuracy of 0.916, compared with a ROC-AUC of 0.698 and an accuracy of 0.841 for the model without ADASYN. These results indicate that ADASYN is effective at improving model performance on imbalanced datasets. Bayesian Optimization also plays an important role in finding a well-performing hyperparameter combination, which can further improve model performance. This research contributes to the development of early detection methods for cardiovascular heart disease, particularly through the application of the XGBoost classification algorithm.
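The balancing stage can be illustrated with a minimal pure-Python sketch of the core idea behind ADASYN: minority samples surrounded by more majority-class neighbors receive a larger share of the synthetic samples, and each synthetic sample is an interpolation between a minority sample and one of its minority neighbors. This is not the authors' implementation, which is not shown in the abstract. The function name `adasyn`, the parameters `k`, `beta`, and `seed`, and the toy data are all assumptions made for illustration, and drawing interpolation partners only from minority-class neighbors is a common simplification.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k (features, label) pairs closest to `point` (Euclidean)."""
    return sorted(candidates, key=lambda c: math.dist(point, c[0]))[:k]


def adasyn(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Illustrative ADASYN sketch; returns a list of synthetic minority points.

    Hypothetical helper for this example only, not a library API.
    """
    rng = random.Random(seed)
    data = list(zip(X, y))
    minority_pts = [x for x, label in data if label == minority]
    n_min = len(minority_pts)
    n_maj = len(data) - n_min

    # Total number of synthetic samples; beta=1.0 targets a balanced set.
    total_needed = int((n_maj - n_min) * beta)

    # r_i: fraction of majority-class points among the k nearest neighbors
    # of each minority point, searched over the whole dataset.
    ratios = []
    for i, (x, label) in enumerate(data):
        if label != minority:
            continue
        others = data[:i] + data[i + 1:]
        neighbors = k_nearest(x, others, k)
        ratios.append(sum(1 for _, l in neighbors if l != minority) / k)

    ratio_sum = sum(ratios)
    if total_needed <= 0 or ratio_sum == 0:
        return []

    synthetic = []
    for x, r in zip(minority_pts, ratios):
        # Harder-to-learn points (larger r) get more synthetic samples.
        g_i = round(r / ratio_sum * total_needed)
        partners = k_nearest(
            x, [(p, minority) for p in minority_pts if p is not x], k
        )
        if not partners:
            continue
        for _ in range(g_i):
            z = rng.choice(partners)[0]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic


# Toy imbalanced data (assumed for illustration): 8 majority, 3 minority.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (0, 2), (1, 2),
     (2, 2), (3, 3), (3, 2)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

synthetic = adasyn(X, y, minority=1, k=3)
print(len(synthetic), "synthetic minority samples generated")
```

Because the per-sample counts are rounded, the number of generated samples can differ slightly from the balancing target. In practice a maintained library implementation would normally be preferred over a hand-written version like this one.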



Article History
Submitted: 2024-06-26
Published: 2024-06-30
How to Cite
Permana, A., Umbara, F., & Kasyidi, F. (2024). Klasifikasi Penyakit Jantung Tipe Kardiovaskular Menggunakan Adaptive Synthetic Sampling dan Algoritma Extreme Gradient Boosting. Building of Informatics, Technology and Science (BITS), 6(1), 499-508. https://doi.org/10.47065/bits.v6i1.5421