Customer Segmentation Using K-Means Clustering on Online Retail Transaction Data
Abstract
The telecommunications industry faces challenges in understanding customer characteristics due to large data volumes and diverse service usage behaviors. Customer segmentation becomes a strategic approach to support more targeted and effective marketing strategies. This study aims to apply the K-Means clustering algorithm to perform data-driven customer segmentation in a telecommunications company. The customer dataset undergoes preprocessing stages including missing value handling, categorical variable encoding using One-Hot Encoding, and feature scaling with StandardScaler. The optimal number of clusters is determined using the Elbow Method. The results show the formation of three customer segments with distinct characteristics based on tenure, monthly charges, total charges, and churn rate. Visualization using Principal Component Analysis (PCA) clearly illustrates the separation among clusters. An interesting finding reveals that the segment with the highest customer value also has the highest churn rate, indicating the need for more specific retention strategies. The contribution of this study lies in providing an unsupervised learning-based customer segmentation approach that can assist companies in designing more effective marketing strategies, improving customer retention, and supporting data-driven decision making.
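The workflow summarized in the abstract (missing value handling, One-Hot Encoding, StandardScaler, the Elbow Method, K-Means with three clusters, and a PCA projection) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' code: the column names, the median imputation, the choice to keep churn out of the clustering features, and all parameter values are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the customer dataset (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, n).astype(float),
    "monthly_charges": rng.uniform(20, 120, n),
    "contract": rng.choice(["monthly", "one_year", "two_year"], n),
    "churn": rng.integers(0, 2, n),
})
df["total_charges"] = df["tenure"] * df["monthly_charges"]
# Inject a few missing values so the imputation step has something to do.
df.loc[rng.choice(n, 10, replace=False), "total_charges"] = np.nan

numeric = ["tenure", "monthly_charges", "total_charges"]
categorical = ["contract"]

# Preprocessing: impute (median is an assumption) and scale numeric columns,
# one-hot encode categorical columns. sparse_threshold=0 forces dense output.
preprocess = ColumnTransformer(
    [
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ]), numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ],
    sparse_threshold=0,
)
X = preprocess.fit_transform(df)

# Elbow Method: record inertia (within-cluster sum of squares) for each k
# and look for the point where the decrease flattens out.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 9)
}

# Fit the final model with k = 3, the number of segments reported in the abstract.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
df["segment"] = kmeans.fit_predict(X)

# Profile each segment; churn is used here only for description, not clustering.
profile = df.groupby("segment")[numeric + ["churn"]].mean()

# Project the scaled features onto two principal components for plotting.
coords = PCA(n_components=2).fit_transform(X)

print(profile.round(2))
```

Plotting `inertias` against k gives the elbow curve, and a scatter plot of `coords` colored by `df["segment"]` gives the kind of PCA view the abstract describes. On random synthetic data the segments carry no real meaning; the sketch only shows how the steps fit together.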
Pages: 28-33
Copyright (c) 2025 Robby Satria Darma, Lukman Sunardi, Asep Toyib

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).