Implementation of the K-Means Clustering Algorithm for Grouping E-Commerce Products Based on Price, Discount, and Total Revenue


  • Rinaldi Pasaribu (*), Universitas Budi Darma, Medan, Indonesia
  • Saidi Ramadan Siregar, Universitas Budi Darma, Medan, Indonesia
  • (*) Corresponding Author
Keywords: K-Means Clustering; E-Commerce; Product Segmentation; Business Intelligence; Elbow Method; Silhouette Score

Abstract

The rapid growth of e-commerce has generated large volumes of transactional data, yet this data is not fully used to support strategic decision-making, particularly product segmentation. This study addresses the absence of a systematic approach to grouping products by key attributes such as price, discount, and revenue, a gap that leads to less effective pricing and promotional strategies. It therefore aims to analyze product sales patterns and to cluster e-commerce products on three attributes: price, discount_percent, and total_revenue. The dataset is an Amazon-style e-commerce dataset of 50,000 transaction records and 13 attributes, of which these three serve as the basis for clustering. The method is K-Means Clustering, preceded by data preprocessing and normalization with Min-Max Scaling, with the number of clusters chosen using the Elbow Method and the Silhouette Score. The results indicate that three clusters are optimal, supported by the highest Silhouette Score (0.354) and a clear elbow in the Elbow graph. A Davies-Bouldin Index of 0.9335 indicates fairly good, though not yet optimal, clustering quality. The three resulting groups are a premium product cluster (high price, low discount, high revenue), a discount product cluster (moderate price, high discount, moderate revenue), and a low-performance product cluster (low price, low discount, low revenue). In conclusion, the K-Means algorithm can effectively cluster e-commerce products on relevant numerical attributes and yield insights that support business decisions such as pricing and promotion.
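
The pipeline named in the abstract (Min-Max Scaling, K-Means, then the Elbow Method, Silhouette Score, and Davies-Bouldin Index) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption rather than the authors' implementation: the nine `products` rows are invented toy values for (price, discount_percent, total_revenue), the function names are illustrative, and the deterministic farthest-point initialization is swapped in only to keep the example reproducible, since the abstract does not say how centroids were initialized.

```python
def min_max_scale(rows):
    """Rescale each column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - lo) or 1.0 for c, lo in zip(cols, lows)]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]

def dist(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    # Assumed initialization: first point, then repeatedly the point farthest
    # from all centroids chosen so far (deterministic, not from the paper).
    centroids = [list(points[0])]
    while len(centroids) < k:
        centroids.append(list(max(points, key=lambda p: min(dist(p, c) for c in centroids))))
    labels = []
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Inertia (within-cluster sum of squares) is what the Elbow Method plots.
    inertia = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia

def silhouette(points, labels):
    """Mean silhouette coefficient (O(n^2)); needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin Index; lower values mean better-separated clusters."""
    ids = sorted(set(labels))
    scatter = {}
    for j in ids:
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter[j] = sum(dist(p, centroids[j]) for p in members) / len(members)
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in ids if j != i) for i in ids]
    return sum(worst) / len(worst)

# Hypothetical toy rows: (price, discount_percent, total_revenue).
products = [
    (900, 5, 45000), (950, 4, 52000), (880, 6, 47000),   # premium-like
    (400, 40, 20000), (420, 45, 22000), (380, 42, 19000),  # discount-like
    (50, 5, 1000), (60, 3, 1500), (40, 6, 800),          # low-performance-like
]
scaled = min_max_scale(products)
for k in range(2, 5):
    labels, centroids, inertia = k_means(scaled, k)
    print(k, round(inertia, 3), round(silhouette(scaled, labels), 3),
          round(davies_bouldin(scaled, labels, centroids), 3))
```

Scaling comes first because total_revenue would otherwise dominate the Euclidean distance. Reading the printed rows, the elbow is the value of k after which inertia stops falling sharply, while a higher silhouette and a lower Davies-Bouldin value both favor that k. On real data a library implementation with several random restarts would normally replace this sketch.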



Article History
Published: 2026-05-18