Sentiment Classification of Product Reviews on E-Commerce Platforms in Indonesia Using the Pre-Trained IndoBERT Model


  • Bayu Puspito Aji (*), Universitas Muhammadiyah Malang, Kota Malang, Indonesia
  • Christian Sri Kusuma Aditya, Universitas Muhammadiyah Malang, Kota Malang, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; IndoBERT; Product Reviews; Deep Learning; E-Commerce

Abstract

Sentiment analysis of product reviews on e-commerce platforms is increasingly important, particularly on Tokopedia, one of the largest marketplaces in Indonesia. Tokopedia lets users leave reviews after completing a transaction, and these reviews help businesses understand how customers perceive their products. This research classifies the sentiment of Tokopedia product reviews using the IndoBERT model and compares its performance with LSTM-based methods that use FastText, GloVe, and Word2Vec embeddings. In previous research, the LSTM-FastText model achieved the highest accuracy among those methods, 85.08%. In this study, 5,400 product reviews were classified into two sentiment categories, positive and negative, with the dataset divided into three groups: training, validation, and testing. The contribution of this research is an evaluation of how effective IndoBERT is relative to the earlier LSTM models with FastText, GloVe, and Word2Vec embeddings. The IndoBERT model achieved an accuracy of 97%, with an F1-score of 97% for both sentiment categories. Because it is pre-trained on a large Indonesian corpus, IndoBERT captures the context of Indonesian text better than the LSTM models used in previous studies, which allows it to classify Indonesian product reviews more accurately.
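
The evaluation setup described above (a three-way split of 5,400 binary-labelled reviews, scored by accuracy and per-class F1) can be sketched in plain Python. This is an illustrative sketch only, not the authors' pipeline: the 80/10/10 ratios, the balanced 2,700/2,700 class counts, the stratified sampling, and the helper names `stratified_split`, `accuracy`, and `f1_for_class` are all assumptions, since the abstract does not state them. No IndoBERT model is loaded here; the predictions are toy values.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.8, 0.1, 0.1), seed=42):
    """Return (train, val, test) index lists that preserve class proportions.

    The 80/10/10 ratios are an assumption; the abstract does not give them.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * ratios[0])
        n_val = round(len(indices) * ratios[1])
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])  # remainder goes to the test set
    return train, val, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1-score (harmonic mean of precision and recall) for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical balanced label set of 5,400 reviews (class balance is assumed).
labels = ["positive"] * 2700 + ["negative"] * 2700
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))

# Toy predictions standing in for a classifier's output on four reviews.
y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
print(accuracy(y_true, y_pred))
print(f1_for_class(y_true, y_pred, "positive"))
print(f1_for_class(y_true, y_pred, "negative"))
```

Reporting F1 separately for the positive and negative classes, as the abstract does, shows whether a high accuracy hides weak performance on one of the two categories.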

Article History
Submitted: 2025-02-11
Published: 2025-03-07
Abstract View: 75 times
PDF Download: 27 times
How to Cite
Aji, B., & Aditya, C. (2025). Klasifikasi Sentimen Ulasan Produk pada Platform E-Commerce di Indonesia dengan Menggunakan Model Pre-Trained IndoBERT. Building of Informatics, Technology and Science (BITS), 6(4), 2491-2500. https://doi.org/10.47065/bits.v6i4.6968
Section: Articles