Performance Comparison of XGBoost and Naive Bayes in Sentiment Analysis of TikTok Comments on Ibu Kota Nusantara (IKN) with Imbalanced Data

Novi Purnamasari; Nirwana Hendrastuty

doi:10.47065/bits.v7i4.9488

Novi Purnamasari, Universitas Teknokrat Indonesia, Bandar Lampung, Indonesia
Nirwana Hendrastuty*, Universitas Teknokrat Indonesia, Bandar Lampung, Indonesia

(*) Corresponding Author

Keywords: IKN; Naive Bayes; Sentiment Analysis; TF-IDF; TikTok; Word Cloud; XGBoost

Abstract

The growth of social media has generated diverse public responses regarding the development of Indonesia’s new capital city, Ibu Kota Nusantara (IKN), particularly on TikTok, a platform with high user interaction. This study compares the performance of the Naive Bayes and eXtreme Gradient Boosting (XGBoost) algorithms in sentiment analysis of TikTok comments related to IKN development under imbalanced data conditions. The dataset consists of 1,132 comments that underwent preprocessing, including case folding, text cleaning, tokenization, normalization, and stemming. Feature extraction was performed using the Term Frequency–Inverse Document Frequency (TF-IDF) method, generating 1,926 features to represent word importance. The data were split 80:20 into training and testing sets. The results show that Naive Bayes achieved an accuracy of 61.23%, while XGBoost obtained a slightly higher accuracy of 62.11%. XGBoost improved recall in the negative class (from 0.21 to 0.40) and the neutral class (from 0.11 to 0.26), although the improvement remains limited. The difference in accuracy between the models is small (0.88 percentage points) and does not indicate a significant overall performance improvement. This study is limited by the relatively small dataset size and imbalanced class distribution, which may affect data representativeness and model generalization. Therefore, the results are not yet optimal for broader real-world applications.
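The abstract reports both overall accuracy and per-class recall, and the two can diverge sharply on imbalanced data. The following is a minimal sketch of that effect in plain Python, not the authors' evaluation code. The labels, class sizes, and predictions are hypothetical toy values chosen for illustration and are not taken from the study's dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall_per_class(y_true, y_pred):
    """Recall per class: correctly predicted items / items truly in that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


# Hypothetical toy labels (NOT the study's data): 10 positive, 5 negative, 5 neutral.
y_true = ["positive"] * 10 + ["negative"] * 5 + ["neutral"] * 5

# A hypothetical classifier biased toward the majority class: every positive
# comment is correct, but only one negative and one neutral comment are.
y_pred = (
    ["positive"] * 10
    + ["negative"] + ["positive"] * 4
    + ["neutral"] + ["positive"] * 4
)

acc = accuracy(y_true, y_pred)
rec = recall_per_class(y_true, y_pred)
macro_recall = sum(rec.values()) / len(rec)

print(f"accuracy     = {acc:.2f}")
for label, value in rec.items():
    print(f"recall[{label}] = {value:.2f}")
print(f"macro recall = {macro_recall:.2f}")
```

In this toy case the accuracy is 0.60 while recall on each minority class is only 0.20, which is why per-class recall is worth reading alongside accuracy when classes are imbalanced.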
