Sentiment Classification of Product Reviews on E-Commerce Platforms in Indonesia Using the Pre-Trained IndoBERT Model
Abstract
Sentiment analysis of product reviews on e-commerce platforms is increasingly important, particularly on Tokopedia, one of the largest marketplaces in Indonesia. Tokopedia lets users leave reviews after completing a transaction, and these reviews help businesses understand how customers perceive their products. This research classifies the sentiment of Tokopedia product reviews using the IndoBERT model and compares its performance with LSTM-based methods combined with FastText, GloVe, and Word2Vec embeddings. In previous research, the LSTM-FastText model achieved the highest accuracy among those methods, 85.08%. This study uses a total of 5,400 reviews, classified into two sentiment categories (positive and negative), with the dataset divided into three groups: training, validation, and testing. The contribution of this research is an evaluation of how effectively IndoBERT performs relative to the earlier LSTM models with FastText, GloVe, and Word2Vec embeddings. The IndoBERT model achieved an accuracy of 97%, with an F1-score of 97% for both sentiment categories. Because it is pre-trained on a large Indonesian corpus, IndoBERT captures the context of Indonesian text better than the LSTM models used in previous studies, which allows it to classify Indonesian product reviews more accurately.
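The evaluation setup described in the abstract (a three-way dataset split, then accuracy and a per-class F1-score for the positive and negative labels) can be illustrated with a minimal sketch. This is not the authors' pipeline: the 80/10/10 split ratio, the seed, the function names, and the toy labels are all assumptions made for illustration, since the abstract states only the dataset size and the three groups.

```python
import random


def split_dataset(items, train=0.8, val=0.1, seed=42):
    """Shuffle and split into train/validation/test sets.

    The 80/10/10 ratio is an assumption; the abstract does not state it.
    Whatever is left after train and validation goes to the test set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1-score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 5,400 placeholder review IDs stand in for the real Tokopedia reviews.
train_set, val_set, test_set = split_dataset(range(5400))

# Toy labels (hypothetical, not results from the paper).
y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
acc = accuracy(y_true, y_pred)
f1_pos = f1_for_label(y_true, y_pred, "positive")
f1_neg = f1_for_label(y_true, y_pred, "negative")
```

Reporting F1 separately for each class, as the abstract does, shows whether a model favors one sentiment over the other, which a single accuracy figure can hide.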
Pages: 2491-2500
Copyright (c) 2025 Bayu Puspito Aji, Christian Sri Kusuma Aditya

This work is licensed under a Creative Commons Attribution 4.0 International License.