Optimizing Insurance Customer Segmentation with C4.5 Decision Tree Algorithm


  • Sigit Candra Setya, Institut Teknologi Pagar Alam, Pagar Alam, Indonesia
  • Moch. Iswan Perangin-angin, Universitas Budi Darma, Medan, Indonesia
  • Marsono Marsono, STMIK Triguna Dharma, Medan, Indonesia
  • Asyahri Hadi Nasyuha (*), Universitas Teknologi Digital Indonesia, Yogyakarta, Indonesia
  • Lucia Nugraheni Harnaningrum, Universitas Teknologi Digital Indonesia, Yogyakarta, Indonesia
  • (*) Corresponding Author
Keywords: Data Mining; Decision Tree; C4.5 Algorithm; Insurance Risk Classification; Customer Segmentation

Abstract

Insurance companies rely on premium payments as their primary source of revenue, but economic instability often delays those payments and disrupts revenue recording. This study applies the C4.5 Decision Tree algorithm to classify insurance customers by premium amount, age, income, and claim history in order to improve product recommendations. Data mining techniques are used to analyze customer attributes and generate decision rules for selecting a suitable insurance product. The findings indicate that customers paying a premium of IDR 500,000 are best suited to PRUMed Cover (PMC), while those paying IDR 1,000,000 are recommended PRUCritical Benefit 88 (PCB88). For customers paying IDR 750,000, additional factors such as age and income level determine the recommended product. The entropy and information gain calculations identify premium amount as the most significant attribute for decision-making, followed by age, income, and claim history. With this method, insurance companies can refine customer segmentation, streamline product selection, and better target their marketing. The transparent, interpretable structure of the decision tree supports regulatory compliance while improving customer satisfaction. Future research should explore additional variables, such as behavioral data and regional trends, and compare C4.5 with other classification algorithms, such as Random Forest or Support Vector Machines (SVM), to improve accuracy and scalability.
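
The entropy and information gain calculations mentioned in the abstract can be illustrated with a minimal Python sketch. The six rows below are invented for illustration and are not the study's data; the attribute names ("premium", "age", "product") and the function names are likewise assumptions. The gain ratio is included because it is the split criterion C4.5 uses on top of information gain.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def gain_ratio(rows, attribute, target):
    """C4.5 criterion: information gain divided by split information."""
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0
    return information_gain(rows, attribute, target) / split_info

# Hypothetical toy data (NOT from the paper): premium in IDR, coarse age band.
rows = [
    {"premium": 500_000,   "age": "young", "product": "PMC"},
    {"premium": 500_000,   "age": "old",   "product": "PMC"},
    {"premium": 750_000,   "age": "young", "product": "PMC"},
    {"premium": 750_000,   "age": "old",   "product": "PCB88"},
    {"premium": 1_000_000, "age": "young", "product": "PCB88"},
    {"premium": 1_000_000, "age": "old",   "product": "PCB88"},
]

# Rank the candidate attributes; the one with the highest score becomes the root.
for attribute in ("premium", "age"):
    print(attribute,
          round(information_gain(rows, attribute, "product"), 3),
          round(gain_ratio(rows, attribute, "product"), 3))
```

On this toy data the premium attribute scores higher than age, so it would be chosen as the root split, and only the mixed IDR 750,000 branch would need a further split on age. That mirrors the shape of the rules described in the abstract, though the real tree depends on the study's actual dataset.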


Article History
Submitted: 2025-05-15
Published: 2025-07-20
Abstract View: 552 times
PDF Download: 413 times
How to Cite
Setya, S. C., Perangin-angin, M. I., Marsono, M., Nasyuha, A. H., & Harnaningrum, L. N. (2025). Optimizing Insurance Customer Segmentation with C4.5 Decision Tree Algorithm. Journal of Information System Research (JOSH), 6(4), 1938-1947. https://doi.org/10.47065/josh.v6i4.7358
Section: Articles