Comparison of TF-IDF and Word2Vec Feature Extraction with Naïve Bayes for Sentiment Analysis of IKN Development on YouTube


  • Mu. Aldi Rahmad Fahrozi, Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
  • Taghfirul Azhima Yoga Siswa (*), Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
  • Naufal Azmi Verdikha, Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; IKN; The Nation's Capital; Nusantara; Naïve Bayes; TF-IDF; Word2Vec

Abstract

The development of Indonesia’s New Capital City (IKN) has generated diverse public responses on social media, particularly YouTube, making sentiment analysis necessary to map public perceptions. Previous studies have reported relatively low classification accuracy, as low as 60%, indicating the need for more effective approaches. This study compares the performance of the Naïve Bayes algorithm in classifying public sentiment toward the IKN development using two feature extraction methods: TF-IDF and Word2Vec. The data were collected from YouTube comments, preprocessed, labeled by experts, and evaluated using 10-fold cross-validation. The results show that the TF-IDF-based Multinomial Naïve Bayes model performs best, with an accuracy of 83%, a recall of 82% on the positive class, and an F1-score of 85% on the negative class. It outperforms the Word2Vec-based Gaussian Naïve Bayes model, which attains an accuracy of 82% and a lower positive-class recall of 76%. These findings indicate that TF-IDF is more effective and stable than Word2Vec on short-text comments, because Word2Vec requires a larger corpus to learn an optimal semantic representation.
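The TF-IDF and Multinomial Naïve Bayes combination described in the abstract can be illustrated with a minimal from-scratch sketch. This is not the authors' implementation: the toy comments, labels, function names, the smoothed IDF formula, and the Laplace smoothing value `alpha=1.0` are all assumptions made for illustration, and the preprocessing, expert labeling, and 10-fold cross-validation steps are omitted.

```python
import math
from collections import Counter, defaultdict


def tfidf_fit(docs):
    """Learn smoothed IDF weights from tokenized training documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1, always positive.
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_transform(doc, idf):
    """Map one tokenized document to {term: tf * idf}; unseen terms are dropped."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def train_mnb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    class_counts = Counter(labels)
    term_sums = {c: defaultdict(float) for c in class_counts}
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            term_sums[y][t] += w
    model = {}
    for c, n_c in class_counts.items():
        total = sum(term_sums[c].values()) + alpha * len(vocab)
        log_prior = math.log(n_c / len(labels))
        log_lik = {t: math.log((term_sums[c][t] + alpha) / total) for t in vocab}
        model[c] = (log_prior, log_lik)
    return model


def predict(model, vec):
    """Return the class with the highest log posterior score."""
    scores = {
        c: log_prior + sum(w * log_lik[t] for t, w in vec.items())
        for c, (log_prior, log_lik) in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical, already-tokenized toy comments (not data from the study).
train_docs = [
    ["pembangunan", "ikn", "bagus", "maju"],
    ["ikn", "bagus", "dukung"],
    ["ikn", "boros", "anggaran"],
    ["proyek", "boros", "gagal"],
]
train_labels = ["positive", "positive", "negative", "negative"]

idf = tfidf_fit(train_docs)
train_vectors = [tfidf_transform(d, idf) for d in train_docs]
model = train_mnb(train_vectors, train_labels, vocab=list(idf))

print(predict(model, tfidf_transform(["ikn", "dukung", "maju"], idf)))
print(predict(model, tfidf_transform(["anggaran", "gagal"], idf)))
```

In a Word2Vec variant, each comment would instead be represented by a dense vector (for example, the average of its word embeddings) and classified with Gaussian Naïve Bayes, since those features are continuous and can be negative.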



Article History
Submitted: 2026-01-13
Published: 2026-01-31
How to Cite
Rahmad Fahrozi, M. A., Siswa, T., & Verdikha, N. (2026). Komparasi Ekstraksi Fitur TF-IDF dan Word2Vec pada Naïve Bayes untuk analisis Sentimen Pembangunan IKN di YouTube. Journal of Information System Research (JOSH), 7(2), 519-527. https://doi.org/10.47065/josh.v7i2.9198