Churn Classification with the XGBoost Algorithm Using Boruta-SHAP Feature Selection
Abstract
Customer churn is a critical issue for telecommunications companies because it directly affects revenue and business sustainability. This study develops a churn prediction model that combines the Extreme Gradient Boosting (XGBoost) algorithm with Boruta feature selection and SHAP (SHapley Additive exPlanations)-based feature interpretation. The dataset is the Telco Customer Churn dataset from Kaggle, which contains 7,043 customer records and 21 features. The research stages are data preprocessing, data transformation, an 80:20 train-test split, class balancing with SMOTE, feature selection with Boruta, feature interpretation with SHAP, and classification with XGBoost. Model performance was evaluated using accuracy, precision, recall, and F1-score. The XGBoost model with Boruta-SHAP (Model B) achieved an accuracy of 0.7576, slightly higher than the 0.7512 of the model without feature selection (Model A). Model B also performed better on the majority class (non-churn), with recall rising from 0.76 to 0.79 and F1-score from 0.82 to 0.83. For the minority class (churn), however, recall fell from 0.72 to 0.66, while precision rose from 0.52 to 0.54. These findings indicate that integrating Boruta-SHAP can improve model efficiency and interpretability, but additional strategies are needed to maintain performance on the minority class.
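The abstract reports precision and recall for the churn class but not the churn-class F1-score. As a rough illustration of the trade-off, the sketch below derives F1 from the reported pairs using the standard harmonic-mean definition. The function name `f1_score` and the dictionary labels are illustrative choices, not names from the study, and because the inputs are rounded to two decimals the results are approximate.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Churn-class (precision, recall) pairs as reported in the abstract.
# These are two-decimal roundings, so the derived F1 is only approximate.
reported = {
    "Model A (no feature selection)": (0.52, 0.72),
    "Model B (Boruta-SHAP)": (0.54, 0.66),
}

churn_f1 = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, value in churn_f1.items():
    print(f"{name}: churn F1 = {value:.3f}")
```

Under these assumptions the derived churn-class F1 is about 0.604 for Model A and 0.594 for Model B, which suggests that the small gain in precision does not offset the drop in recall for the minority class.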
Pages: 1193-1201
Copyright (c) 2025 Dwi Wahyu Kuncoro Hadi Sakaro, Puspita Nurul Shabrina, Edvin Ramadhan

This work is licensed under a Creative Commons Attribution 4.0 International License.