E-Commerce Customer Segmentation Based on the Integration of Text Mining and RFM for Early Churn Detection


  • Violin Juneyla Nandita, Universitas Sriwijaya, Palembang, Indonesia
  • Juseia Wulandari *, Universitas Sriwijaya, Palembang, Indonesia
  • Apriyadi Apriyadi, Universitas Sriwijaya, Palembang, Indonesia
  • Ali Ibrahim, Universitas Sriwijaya, Palembang, Indonesia
  • Fathoni Fathoni, Universitas Sriwijaya, Palembang, Indonesia
  • (*) Corresponding Author
Keywords: CRM Analytics; Text Mining; Latent Dirichlet Allocation; Random Forest; Sentiment Analysis

Abstract

The growth of transactions on e-commerce platforms generates a massive volume of unstructured customer review data. Traditional Customer Relationship Management (CRM) models such as Recency, Frequency, Monetary (RFM) analysis, however, typically rely on quantitative transaction data alone and ignore the emotional dimension expressed in customer reviews. This study analyzes the relationship between purchase frequency and the polarity of customer comments by integrating Text Mining and CRM Analytics. Its novelty is a hybrid method that combines Lexicon Refinement-based sentiment extraction with the Random Forest algorithm to overcome rating bias in global e-commerce platform data obtained from Kaggle. The proposed method applies Natural Language Processing (NLP) techniques, topic modeling with Latent Dirichlet Allocation (LDA), and sentiment analysis to extract polarity scores. In testing, the initial lexicon model reached an accuracy of only 52.14%, limited by noise in neutral (3-star) reviews. After optimization with the Random Forest algorithm and filtering of the neutral data, classification accuracy rose to 74.62%. These results indicate that integrating sentiment yields a more accurate mapping of customer loyalty and can help e-commerce management detect potential churn in the At-Risk customer segment.
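
To make the idea of combining RFM with review polarity concrete, the following is a minimal sketch only, not the authors' pipeline. It assumes a toy word lexicon, hypothetical function names (`polarity`, `rfm`, `at_risk`), and illustrative thresholds (`max_recency`, `min_frequency`); none of these come from the study, and the Random Forest and LDA stages are deliberately left out.

```python
from datetime import date

# Hypothetical toy lexicon; a real study would use a curated, refined lexicon.
LEXICON = {"great": 1, "good": 1, "love": 1, "fast": 1,
           "bad": -1, "broken": -1, "late": -1, "terrible": -1}


def polarity(text):
    """Mean lexicon score over matched words, in [-1, 1]; 0.0 if none match."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def rfm(orders, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    `orders` is an iterable of (customer, order_date, amount) tuples.
    """
    table = {}
    for customer, order_date, amount in orders:
        days = (today - order_date).days
        r, f, m = table.get(customer, (days, 0, 0.0))
        # Recency is the gap to the most recent order, so keep the minimum.
        table[customer] = (min(r, days), f + 1, m + amount)
    return table


def at_risk(orders, reviews, today, max_recency=60, min_frequency=2):
    """Flag repeat buyers who have gone quiet and whose reviews lean negative.

    Thresholds are illustrative assumptions, not values from the study.
    """
    scores = {}
    for customer, text in reviews:
        scores.setdefault(customer, []).append(polarity(text))
    flagged = []
    for customer, (r, f, _m) in rfm(orders, today).items():
        pols = scores.get(customer, [0.0])  # no reviews: treat as neutral
        mean_polarity = sum(pols) / len(pols)
        if r > max_recency and f >= min_frequency and mean_polarity < 0:
            flagged.append(customer)
    return sorted(flagged)
```

In this sketch a customer is flagged only when the transactional signal (long recency, repeat purchases) and the textual signal (negative mean polarity) agree, which is one simple way to read the abstract's claim that sentiment adds information to RFM-only segmentation.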



Article History
Submitted: 2026-04-17
Published: 2026-06-05
How to Cite
Nandita, V., Wulandari, J., Apriyadi, A., Ibrahim, A., & Fathoni, F. (2026). Segmentasi Pelanggan E-Commerce Berbasis Integrasi Text Mining dan RFM untuk Deteksi Dini Churn. Building of Informatics, Technology and Science (BITS), 8(1), 228-235. https://doi.org/10.47065/bits.v8i1.9687
Section
Articles
