Improving Naïve Bayes Model Performance for Sentiment Analysis of Comments on “Sound Horeg” Using SMOTE and Parameter Tuning


  • Mustafid Kaisalana Universitas Dian Nuswantoro, Semarang, Indonesia
  • Gustina Alfa Trisnapradika * Universitas Dian Nuswantoro, Semarang, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; Naïve Bayes; Sound Horeg; Text Classification; SMOTE; Hyperparameter Tuning

Abstract

The “Sound Horeg” phenomenon has sparked diverse public reactions on online platforms, making sentiment analysis a useful tool for understanding public opinion. This study classifies user comments about “Sound Horeg” as positive or negative using the Naïve Bayes algorithm. The dataset is strongly imbalanced, with negative sentiment predominating. The methodology comprises a series of text preprocessing stages: case folding, tokenizing, normalization, lexicon-based sentiment labeling, stopword removal, stemming, and duplicate removal. Sentiment labeling uses an Indonesian sentiment lexicon compiled from two files, lexicon_positif.csv and lexicon_negatif.csv, which contain predefined lists of words with positive and negative sentiment scores drawn from Indonesian public opinion lexicons. Text features are then extracted using the Term Frequency–Inverse Document Frequency (TF-IDF) method. To address the class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied to the training data only, balancing the number of positive and negative samples. The Naïve Bayes model is then tuned with GridSearchCV to determine the best value of the smoothing parameter alpha. Experimental results show that the unoptimized Naïve Bayes model achieved an accuracy of 73% but struggled to classify the minority class (positive sentiment) because of the class imbalance. After SMOTE and parameter tuning were applied, the model’s performance improved significantly, demonstrating the effectiveness of these techniques in producing a more balanced and robust model. The study concludes that the Naïve Bayes algorithm, when combined with SMOTE and hyperparameter tuning, is effective for Indonesian-language sentiment analysis, particularly on imbalanced datasets. Future work may explore other algorithms, broader sentiment lexicons, and more complex linguistic features to further improve model performance.
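The lexicon-based labeling and SMOTE steps described in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. It is not the authors' code: the toy lexicon, its scores, the feature vectors, the tie rule, and the function names `label_by_lexicon` and `smote` are all assumptions made for illustration, and a real experiment would more likely rely on a library implementation of SMOTE applied to TF-IDF features.

```python
import math
import random

# Hypothetical toy lexicon (word -> score); the study's lexicon files are far larger.
TOY_LEXICON = {"bagus": 3, "keren": 4, "berisik": -4, "ganggu": -3}


def label_by_lexicon(tokens, lexicon):
    """Sum the lexicon scores of the tokens and map the total to a label.

    Assumption: a total of zero or below is labeled 'negative', because the
    abstract describes two classes and does not state a tie rule.
    """
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    return "positive" if score > 0 else "negative"


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors standing in for TF-IDF rows of the training split.
negative = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.9, 0.3], [0.8, 0.0], [0.6, 0.2]]
positive = [[0.1, 0.9], [0.2, 0.7], [0.3, 0.8]]

# Oversample the minority (positive) class until both classes are the same size.
new_positive = smote(positive, n_new=len(negative) - len(positive))
balanced_positive = positive + new_positive
```

Because each synthetic sample lies on a line segment between two real minority samples, oversampling adds variety without simply duplicating rows. Applying it to the training split only, as the abstract specifies, keeps synthetic samples out of the evaluation data.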



Article History
Submitted: 2025-10-17
Published: 2025-12-08
How to Cite
Kaisalana, M., & Trisnapradika, G. (2025). Peningkatan Kinerja Model Naïve Bayes untuk Analisis Sentimen Komentar Terkait “Sound Horeg” Menggunakan SMOTE dan Tuning Parameter. Building of Informatics, Technology and Science (BITS), 7(3), 1500-1511. https://doi.org/10.47065/bits.v7i3.8554