Comparison of Naïve Bayes and Support Vector Machine Based on Term Frequency–Inverse Document Frequency for Sentiment Analysis of Cross-Platform Affiliate Product Reviews on TikTok and Shopee
Abstract
The growth of affiliate marketing on digital platforms, particularly TikTok and Shopee, has led to a rapid increase in consumer reviews that businesses can mine for actionable insights. However, reviews differ linguistically across platforms: Shopee reviews tend to be more repetitive and transactional, whereas TikTok reviews are more informal, rich in slang, and noisier. This difference creates a research gap, because sentiment classification performance may vary across platforms while comparative studies on cross-platform affiliate reviews remain limited. This study analyzes and compares the performance of Multinomial Naïve Bayes and Support Vector Machine in identifying positive and negative sentiment polarity in TikTok and Shopee affiliate product reviews. Data were collected via web scraping during December 2025–January 2026, yielding 5,502 raw reviews. After text preprocessing (case folding, regex-based cleaning, normalization, stopword removal, and stemming using Sastrawi), 4,593 clean reviews were obtained. Lexicon-based automatic labeling with negation handling produced a binary dataset of 3,314 reviews (2,729 positive and 585 negative), indicating class imbalance. No data balancing was applied; instead, evaluation emphasized precision, recall, and F1-score in addition to accuracy. Feature representation used Term Frequency–Inverse Document Frequency, and the dataset was split using an 80:20 hold-out scheme (2,651 training and 663 testing instances). Experimental results show that the Support Vector Machine achieved higher performance (95.93% accuracy; 0.81 negative-class F1) than Multinomial Naïve Bayes (89.14% accuracy; 0.12 negative-class F1).
This advantage is attributed to the Support Vector Machine's ability to learn a maximum-margin hyperplane in the high-dimensional, sparse Term Frequency–Inverse Document Frequency feature space, which makes it more robust to linguistic variation and noise than the probabilistic Naïve Bayes approach, which in turn is more sensitive to majority-class dominance.
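To illustrate why accuracy alone is a weak yardstick on this class distribution, the sketch below computes per-class precision, recall, and F1 for a hypothetical baseline that labels every review as positive. The class counts (2,729 positive, 585 negative) come from the abstract; the baseline itself, the helper `class_metrics`, and the `ClassMetrics` type are illustrative assumptions and are not part of the study.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    precision: float
    recall: float
    f1: float


def class_metrics(tp: int, fp: int, fn: int) -> ClassMetrics:
    """Precision, recall, and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return ClassMetrics(precision, recall, f1)


# Class counts reported in the abstract for the labeled dataset.
N_POSITIVE = 2729
N_NEGATIVE = 585

# Hypothetical baseline (an assumption, not from the paper):
# predict "positive" for every review. All positive reviews are
# then correct and every negative review is missed.
baseline_accuracy = N_POSITIVE / (N_POSITIVE + N_NEGATIVE)
baseline_negative = class_metrics(tp=0, fp=0, fn=N_NEGATIVE)
baseline_positive = class_metrics(tp=N_POSITIVE, fp=N_NEGATIVE, fn=0)

print(f"accuracy:    {baseline_accuracy:.4f}")
print(f"negative F1: {baseline_negative.f1:.2f}")
print(f"positive F1: {baseline_positive.f1:.2f}")
```

Under these assumptions the baseline reaches roughly 82% accuracy while its negative-class F1 is zero. This is why a high accuracy paired with a negative-class F1 of 0.12 signals that a classifier is mostly following the majority class.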
Pages: 2573-2585
Copyright (c) 2026 Clara Indriani Putri, Aditia Yudhistira

This work is licensed under a Creative Commons Attribution 4.0 International License.