Analysis of a Public Sentiment Classification Model on IKN Sustainability Policy Using BERT as a Feature Extractor and K-Nearest Neighbor (KNN)


  • Mohammad Hiqmal Fiqri (*), Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
  • Rudiman Rudiman, Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
  • Naufal Azmi Verdikha, Universitas Muhammadiyah Kalimantan Timur, Samarinda, Indonesia
  • (*) Corresponding Author
Keywords: IndoBERT; K-Nearest Neighbor; Nusantara Capital City; YouTube; Classification

Abstract

This study evaluates a sentiment classification model for public opinion on the relocation of Indonesia’s new capital (IKN), combining IndoBERT as a feature extractor with K-Nearest Neighbor (KNN) as the classifier. The dataset consists of 1,274 YouTube comments related to IKN, annotated by an expert in sociology and text analysis. Preprocessing removed numbers, URLs, emojis, and punctuation, and filtered stopwords using the Sastrawi library. IndoBERT produced 768-dimensional vector representations, which were then classified using KNN with k=5 and Euclidean distance. Evaluation with 5-fold cross-validation yielded an accuracy of 73.31%. However, recall for the positive class was low (0.49), which indicates difficulty in detecting positive comments under class imbalance (831 negative, 294 positive, 149 neutral). These findings suggest that the IndoBERT+KNN model performs well on the majority class but struggles with the minority classes. The contribution of this research is a critical analysis of the limitations of IndoBERT-based models in Indonesian sentiment classification, together with recommendations for future work, including data balancing and fine-tuning.
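The classification step and the imbalance effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `knn_predict` and `recall` are hypothetical, and the 2-D toy vectors stand in for the 768-dimensional IndoBERT embeddings. Only k=5, Euclidean distance, and the class counts are taken from the abstract.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=5):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    ]
    # Keep the k closest neighbours, ordered by distance only.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, Counter returns the label encountered first,
    # i.e. the one belonging to the nearer neighbour.
    return votes.most_common(1)[0][0]


def recall(true_labels, predicted_labels, target):
    """Share of items truly labelled `target` that were predicted as `target`."""
    relevant = [p for t, p in zip(true_labels, predicted_labels) if t == target]
    return sum(p == target for p in relevant) / len(relevant)


# Class counts reported in the abstract.
class_counts = {"negative": 831, "positive": 294, "neutral": 149}
total = sum(class_counts.values())
# Accuracy of a trivial model that always predicts the largest class.
majority_baseline = max(class_counts.values()) / total

# Toy 2-D vectors (hypothetical data, not from the study).
train_vectors = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4),  # negative
    (1.0, 1.0), (0.9, 1.1),                                      # positive
]
train_labels = ["negative"] * 5 + ["positive"] * 2

# The query lies next to both positive points, yet with k=5 the vote
# also includes three distant negative points.
prediction = knn_predict(train_vectors, train_labels, (0.95, 1.0), k=5)
print(f"majority baseline: {majority_baseline:.4f}")
print(f"prediction for query near the positive cluster: {prediction}")
```

In this toy setup the query is outvoted three to two by the larger class, which is one plausible mechanism behind the low positive recall. The class counts also put the reported 73.31% accuracy in context: always predicting "negative" would already score about 65.23% on this dataset.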

Article History
Submitted: 2025-08-09
Published: 2025-09-05
How to Cite
Fiqri, M., Rudiman, R., & Verdikha, N. (2025). Analisis Model Klasifikasi Sentimen Publik Terhadap Kebijakan Keberlanjutan IKN Menggunakan BERT Sebagai Feature Extractor dan K-Nearest Neighbor (KNN). Building of Informatics, Technology and Science (BITS), 7(2), 1332-1342. https://doi.org/10.47065/bits.v7i2.8168
Section
Articles