Perbandingan Metode TF-IDF dan Bag of Words dalam Analisis Sentimen Diet Kopi Americano di Media Sosial Twitter Menggunakan Naïve Bayes


  • Rahmatika Suryanti, Universitas Mercu Buana Yogyakarta, Yogyakarta, Indonesia
  • Putri Prasetyaningrum*, Universitas Mercu Buana Yogyakarta, Yogyakarta, Indonesia
(*) Corresponding author
Keywords: Sentiment Analysis; Coffee Diet; Naïve Bayes; TF-IDF; Bag of Words

Abstract

The popularity of diet coffee, particularly the Americano variant, has risen alongside the growing trend toward healthy lifestyles. This trend has generated a wide range of public opinion on social media, which needs to be analyzed to understand consumer perceptions. This study compares two commonly used text feature representation methods, Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW), for sentiment analysis with the Naïve Bayes algorithm. Data were collected from Twitter using relevant keywords and then preprocessed through case folding, cleansing, tokenizing, stopword removal, and stemming. Sentiment labels were assigned manually based on keyword indicators, and the data were classified into positive, negative, and neutral categories. The evaluation shows that the model using TF-IDF features achieved an accuracy of 85%, outperforming the model using BoW features, which reached 64%. This performance gap indicates that the choice of feature representation plays a crucial role in sentiment classification. This research is expected to serve as a reference for optimizing text representation techniques when analyzing public opinion on social media, particularly opinion about diet products and low-calorie beverages.
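
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The stopword list, the example tweets and their labels, the smoothed IDF formula, and the `MultinomialNB` class are all assumptions made for this sketch. Stemming is omitted because it requires a language-specific stemmer, and the abstract does not say which Naïve Bayes variant was used, so a multinomial model with Laplace smoothing is assumed.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; the list used in the study is not given.
STOPWORDS = {"the", "is", "a", "an", "and", "it", "this", "my", "i", "me", "at"}

def preprocess(text):
    """Case folding, cleansing, tokenizing, stopword removal (no stemming)."""
    text = text.lower()                                  # case folding
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)    # drop URLs, mentions, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)                # drop digits and punctuation
    return [tok for tok in text.split() if tok not in STOPWORDS]

def bow_vector(tokens):
    """Bag of Words: raw term counts."""
    return dict(Counter(tokens))

def fit_idf(docs):
    """Smoothed IDF, ln((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}

def tfidf_vector(tokens, idf):
    """TF-IDF: term count times IDF; terms unseen in training are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

class MultinomialNB:
    """Multinomial Naive Bayes with Laplace smoothing over weighted features."""

    def fit(self, vectors, labels, alpha=1.0):
        self.vocab = {t for v in vectors for t in v}
        self.classes = sorted(set(labels))
        self.log_prior = {c: math.log(labels.count(c) / len(labels))
                          for c in self.classes}
        totals = {c: defaultdict(float) for c in self.classes}
        for vec, label in zip(vectors, labels):
            for term, weight in vec.items():
                totals[label][term] += weight
        self.log_like = {}
        for c in self.classes:
            denom = sum(totals[c].values()) + alpha * len(self.vocab)
            self.log_like[c] = {t: math.log((totals[c][t] + alpha) / denom)
                                for t in self.vocab}
        return self

    def predict(self, vector):
        def score(c):
            return self.log_prior[c] + sum(
                w * self.log_like[c][t] for t, w in vector.items() if t in self.vocab)
        return max(self.classes, key=score)

# Invented toy tweets with hand-assigned labels, for illustration only.
TRAIN = [
    ("Americano diet is great, I lost weight!", "positive"),
    ("Love my americano, feeling healthy and great", "positive"),
    ("Americano diet gave me a terrible stomach ache", "negative"),
    ("Terrible idea, this diet made me dizzy", "negative"),
    ("Ordered an americano at the cafe today", "neutral"),
    ("Americano price at the cafe", "neutral"),
]

docs = [preprocess(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)

bow_model = MultinomialNB().fit([bow_vector(d) for d in docs], labels)
tfidf_model = MultinomialNB().fit([tfidf_vector(d, idf) for d in docs], labels)

query = preprocess("Feeling great and healthy")
print("BoW prediction:   ", bow_model.predict(bow_vector(query)))
print("TF-IDF prediction:", tfidf_model.predict(tfidf_vector(query, idf)))
```

The two models differ only in the feature vectors they receive: BoW passes raw counts, while TF-IDF down-weights terms such as "americano" that appear in most tweets. On a toy set this small the two representations will usually agree, so the sketch shows the mechanics of the comparison and does not reproduce the accuracy gap reported in the study.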



Article History
Submitted: 2025-04-30
Published: 2025-06-01
How to Cite
Suryanti, R., & Prasetyaningrum, P. (2025). Perbandingan Metode TF-IDF dan Bag of Words dalam Analisis Sentimen Diet Kopi Americano di Media Sosial Twitter Menggunakan Naïve Bayes. Building of Informatics, Technology and Science (BITS), 7(1), 104-115. https://doi.org/10.47065/bits.v7i1.7244