Comparison of the TF-IDF and Bag of Words Methods for Sentiment Analysis of the Americano Coffee Diet on Twitter Using Naïve Bayes
Abstract
The popularity of diet coffee, particularly the Americano variant, has risen alongside the growing trend toward healthy lifestyles. This has generated a wide range of public opinion on social media, which needs to be analyzed to better understand consumer perceptions. This study compares two commonly used text feature representation methods, Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW), for sentiment analysis with the Naïve Bayes algorithm. Data were collected from Twitter using relevant keywords and then preprocessed through case folding, cleansing, tokenizing, stopword removal, and stemming. Sentiment labels were assigned manually based on keyword indicators, sorting the data into positive, negative, and neutral categories. In the evaluation, the TF-IDF model achieved an accuracy of 85%, outperforming BoW, which obtained 64%. This gap of 21 percentage points indicates that the choice of feature representation plays a crucial role in sentiment classification performance. This research is expected to serve as a reference for optimizing text representation techniques when analyzing public opinion on social media, particularly concerning diet products and low-calorie beverages.
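The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the reported accuracies. The tweets, the tiny English stopword list, and all function and class names are hypothetical, stemming is omitted, and feeding fractional TF-IDF weights into a multinomial Naïve Bayes with Laplace smoothing is an assumed (though common) design choice.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical mini stopword list; a real study would use a full Indonesian list.
STOPWORDS = {"the", "is", "a", "an", "and", "this", "my", "me", "for", "it", "on", "at"}

def preprocess(text):
    """Case folding, cleansing (URLs, mentions, non-letters), tokenizing, stopword removal."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOPWORDS]

def bow_vector(tokens):
    """Bag of Words: raw term counts."""
    return dict(Counter(tokens))

def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}

def tfidf_vector(tokens, idf):
    """TF-IDF: term count scaled by IDF; terms unseen in training are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

class MultinomialNB:
    """Multinomial Naive Bayes with Laplace smoothing over weighted term features."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, vectors, labels):
        self.vocab = {t for v in vectors for t in v}
        self.class_count = Counter(labels)
        self.term_weight = defaultdict(Counter)
        for vec, y in zip(vectors, labels):
            self.term_weight[y].update(vec)  # accumulate per-class feature weights
        self.total = {y: sum(w.values()) for y, w in self.term_weight.items()}
        return self

    def predict(self, vector):
        n_docs = sum(self.class_count.values())
        scores = {}
        for y, count in self.class_count.items():
            denom = self.total[y] + self.alpha * len(self.vocab)
            score = math.log(count / n_docs)  # log prior
            for term, weight in vector.items():
                if term in self.vocab:
                    score += weight * math.log((self.term_weight[y][term] + self.alpha) / denom)
            scores[y] = score
        return max(scores, key=scores.get)

# Invented toy tweets, for illustration only.
train = [
    ("Americano keeps my diet on track, love it", "positive"),
    ("Great low calorie coffee for weight loss", "positive"),
    ("Americano tastes bitter and awful", "negative"),
    ("This diet coffee gave me a headache, terrible", "negative"),
    ("Ordered an americano this morning", "neutral"),
    ("Coffee shop opens at eight", "neutral"),
]
docs = [preprocess(text) for text, _ in train]
labels = [label for _, label in train]

idf = fit_idf(docs)
bow_model = MultinomialNB().fit([bow_vector(d) for d in docs], labels)
tfidf_model = MultinomialNB().fit([tfidf_vector(d, idf) for d in docs], labels)

query = preprocess("Love this great coffee! https://example.com @someone")
print("BoW   :", bow_model.predict(bow_vector(query)))
print("TF-IDF:", tfidf_model.predict(tfidf_vector(query, idf)))
```

On a corpus this small the two representations will usually agree. Differences such as the 85% versus 64% reported above only become measurable on a held-out test set drawn from the full labeled data.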
Pages: 104-115
Copyright (c) 2025 Rahmatika Suryanti, Putri Prasetyaningrum

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).





















