Improving Naïve Bayes Model Performance for Sentiment Analysis of Comments on “Sound Horeg” Using SMOTE and Parameter Tuning
Abstract
The “Sound Horeg” phenomenon has sparked diverse public sentiment on online platforms, making sentiment analysis a useful tool for understanding public opinion. This study classifies user sentiment (positive or negative) toward “Sound Horeg” using the Naïve Bayes algorithm. The dataset exhibits significant class imbalance, with negative sentiment predominating. The methodology applies a series of text preprocessing stages: case folding, tokenization, normalization, lexicon-based sentiment labeling, stopword removal, stemming, and duplicate removal. Sentiment labeling uses an Indonesian sentiment lexicon compiled from two files, lexicon_positif.csv and lexicon_negatif.csv, which contain predefined lists of words with positive and negative sentiment scores drawn from Indonesian public-opinion lexicons. Text features are then extracted using the Term Frequency–Inverse Document Frequency (TF-IDF) method. To address the imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied to the training data to balance the number of positive and negative samples. The Naïve Bayes model is then tuned with GridSearchCV to determine the best value of the smoothing parameter alpha. Experimental results show that the unoptimized Naïve Bayes model achieved an accuracy of 73% but struggled to classify the minority class (positive sentiment) because of the bias in the data. After SMOTE and parameter tuning were applied, the model’s performance improved significantly, demonstrating that these techniques produce a more balanced and robust model. The study concludes that the Naïve Bayes algorithm, when combined with SMOTE and hyperparameter tuning, is effective for Indonesian-language sentiment analysis, particularly on imbalanced datasets. Future work may explore other algorithms, broader sentiment lexicons, and more complex linguistic features to further improve model performance.
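Two of the steps named above, lexicon-based labeling and SMOTE, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The miniature lexicon and its scores, the function names `label_by_lexicon` and `smote`, the rule that a zero total counts as negative, and the choice of `k` are all assumptions made for illustration.

```python
import math
import random

# Hypothetical miniature lexicon: word -> sentiment score (illustrative values,
# not taken from lexicon_positif.csv or lexicon_negatif.csv).
LEXICON = {"bagus": 3, "keren": 4, "berisik": -4, "ganggu": -3, "rusak": -5}


def label_by_lexicon(tokens, lexicon=LEXICON):
    """Sum the scores of known tokens and map the total to a binary label."""
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    # Assumption: a total of exactly zero is treated as negative.
    return "positive" if score > 0 else "negative"


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    print(label_by_lexicon(["sound", "horeg", "keren"]))
    # Toy feature vectors standing in for TF-IDF rows of the minority class.
    print(smote([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], n_new=2))
```

In practice, a library implementation such as the `SMOTE` class in imbalanced-learn would likely be used, fitted on the training split only as the abstract describes, with scikit-learn's `GridSearchCV` searching over the `alpha` parameter of a Naïve Bayes classifier such as `MultinomialNB`.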
Pages: 1500-1511
Copyright (c) 2025 Mustafid Kaisalana, Gustina Alfa Trisnapradika

This work is licensed under a Creative Commons Attribution 4.0 International License.