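Comparison of Naïve Bayes, Random Forest, and KNN Methods for Sentiment Analysis of Nickel Mining (Komparasi Metode Naïve Bayes, Random Forest dan KNN untuk Analisis Sentimen Penambangan Nikel)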


  • Beta Agus Setiyana, Universitas Teknokrat Indonesia, Bandar Lampung, Indonesia
  • Ryan Randy Suryono (*), Universitas Teknokrat Indonesia, Bandar Lampung, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; Raja Ampat; Save Raja Ampat; Naive Bayes; Random Forest; K-Nearest Neighbors; SMOTE; TF-IDF

Abstract

The growing exploitation of natural resources in Indonesia’s conservation areas has raised significant public concern. One prominent case is the planned nickel mining project in Raja Ampat, a region renowned for its extraordinary marine biodiversity. The plan has sparked debate over economic interests, environmental preservation, and the sociocultural values of local communities. Amid this growing public discourse, social media has become a major platform for people to express opinions, support, or opposition regarding mining activities. This study maps public sentiment on the nickel mining issue in Raja Ampat by analyzing 5,556 Indonesian-language tweets collected from the social media platform X with the keyword “save raja ampat” between January and June 2025. The data went through several preprocessing stages (cleaning, case folding, tokenizing, stopword removal, and normalization) and were then represented using TF-IDF. Sentiment labeling was performed semi-automatically with a lexicon-based approach into three categories: positive, neutral, and negative. The resulting distribution was dominated by the neutral class (72.9%), followed by negative (24.3%) and positive (2.8%), indicating class imbalance. To address this imbalance, SMOTE was applied to the training data. Three classical algorithms, K-Nearest Neighbor (KNN), Complement Naïve Bayes (CNB), and Random Forest (RF), were compared using cross-validation and holdout testing, with accuracy, precision, recall, and F1-score as evaluation metrics. The results show that CNB was the most stable performer before SMOTE, while after SMOTE, KNN improved substantially, especially in recall and macro F1-score. These findings indicate that combining data balancing techniques with classical algorithms remains a relevant and efficient methodological baseline for public sentiment analysis on complex environmental issues such as nickel mining in Raja Ampat.
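
The pipeline described in the abstract (preprocessing, lexicon-based labeling, TF-IDF, and SMOTE on the training data) can be illustrated with a minimal sketch in plain Python. Everything concrete here is an assumption made for illustration: the toy `LEXICON`, the `STOPWORDS` set, the example tweets, the smoothed IDF formula, and the helper names `preprocess`, `lexicon_label`, `tfidf`, `smote`, and `balance` are hypothetical and are not taken from the study. The normalization step is omitted, and the seven toy tweets stand in for a training split.

```python
import math
import random
import re
from collections import Counter

# Hypothetical toy lexicon and stopword list; the study's actual resources are not given.
LEXICON = {"selamatkan": 1, "indah": 1, "lestari": 1, "rusak": -1, "tolak": -1, "hancur": -1}
STOPWORDS = {"di", "yang", "dan", "itu", "ini"}


def preprocess(text):
    """Clean (URLs, mentions), case-fold, tokenize, and remove stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]


def lexicon_label(tokens):
    """Sum word polarities and map the score to one of three classes."""
    score = sum(LEXICON.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"


def tfidf(docs):
    """Return the sorted vocabulary and one dense TF-IDF vector per document."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # Smoothed IDF (an assumed variant; the paper does not state which one it used).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    return vocab, [[Counter(d)[t] * idf[t] for t in vocab] for d in docs]


def smote(samples, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating toward one of the k nearest neighbors."""
    if len(samples) < 2:
        raise ValueError("SMOTE-style interpolation needs at least two samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        others = sorted((s for s in samples if s is not base),
                        key=lambda s: math.dist(base, s))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbor)])
    return synthetic


def balance(X, y, seed=0):
    """Oversample every minority class up to the majority class size."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count < target:
            members = [x for x, lab in zip(X, y) if lab == label]
            X_out += smote(members, target - count, seed=seed)
            y_out += [label] * (target - count)
    return X_out, y_out


# Toy tweets (invented for illustration), treated here as the training split.
tweets = [
    "Raja Ampat itu di Papua Barat Daya",
    "Berita tambang nikel Raja Ampat hari ini",
    "Pemerintah bahas izin tambang nikel",
    "Tambang nikel rusak laut di Raja Ampat",
    "Tolak tambang, hutan hancur",
    "Selamatkan Raja Ampat yang indah",
    "Laut Raja Ampat indah dan lestari",
]
docs = [preprocess(t) for t in tweets]
labels = [lexicon_label(d) for d in docs]
vocab, X = tfidf(docs)
X_bal, y_bal = balance(X, labels)
print(Counter(labels))
print(Counter(y_bal))
```

Balancing is applied only to what plays the role of training data, mirroring the abstract's statement that SMOTE was applied to the training set, so that held-out test data stays free of synthetic samples. In practice, library implementations such as scikit-learn's `TfidfVectorizer` and imbalanced-learn's `SMOTE` would typically replace these hand-written helpers; whether the authors used them is not stated in the abstract.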


Article History
Submitted: 2025-09-01
Published: 2025-09-30
How to Cite
Setiyana, B., & Suryono, R. (2025). Komparasi Metode Naïve Bayes, Random Forest dan KNN untuk Analisis Sentimen Penambangan Nikel. Building of Informatics, Technology and Science (BITS), 7(2), 1443-1455. https://doi.org/10.47065/bits.v7i2.8263
Section
Articles
