Comparative Analysis of the Performance of Support Vector Machine, Random Forest, and Naive Bayes Algorithms for Sentiment Classification of YouTube Comments


  • Eustachius Dito Dewantoro Universitas Dian Nuswantoro, Semarang, Indonesia
  • Sindhu Rakasiwi * Universitas Dian Nuswantoro, Semarang, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; YouTube Comments; Machine Learning; Socio-Political Issues

Abstract

Social media platforms such as YouTube have become a primary medium for public discourse on socio-political issues, including the "August 25th protests," which triggered massive polarization in the digital space. The sheer volume of comments calls for a computational approach to sentiment analysis. This study classifies public sentiment into positive and negative categories and compares the performance of Naive Bayes, Random Forest, and Support Vector Machine (SVM). These algorithms were selected because they are computationally more efficient on high-dimensional text data than deep learning models. The methodology comprised collecting 2,917 comments via the YouTube Data API v3, followed by text preprocessing, lexicon-based automated labeling, and TF-IDF feature weighting. Because the dataset is imbalanced, with negative sentiment accounting for 78.8% of comments, stratified sampling was applied to maintain class proportions. SVM achieved the highest accuracy at 88.2%, outperforming Random Forest (83.1%) and Naive Bayes (81.2%). SVM's advantage is attributed to its ability to find an optimal hyperplane that maximizes the margin between classes, which keeps it stable on imbalanced data. This research contributes a robust classification framework for understanding the dynamics of public opinion on specific political issues in Indonesia.
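
The pipeline summarized in the abstract (lexicon-based labeling, TF-IDF weighting, a stratified split, and three classifiers) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the toy lexicon, the synthetic comments, the tie-breaking rule in `lexicon_label`, the 75/25 split, the choice of `LinearSVC` and `MultinomialNB`, and all hyperparameters are hypothetical, and scikit-learn is assumed as the library.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy lexicon; the lexicon used in the study is not given here.
POSITIVE = {"good", "peaceful", "great", "support"}
NEGATIVE = {"bad", "angry", "chaos", "corrupt"}


def lexicon_label(text):
    """Label a comment by counting lexicon hits (ties fall to 'negative', an assumption)."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative"


# Synthetic stand-in for the collected comments: 16 negative, 4 positive,
# roughly mirroring the imbalance reported in the abstract.
comments = [
    f"{word} {target} protest"
    for word in ("bad", "angry", "chaos", "corrupt")
    for target in ("officials", "police", "policy", "response")
] + [f"{word} protest" for word in ("good", "peaceful", "great", "support")]
labels = [lexicon_label(c) for c in comments]

# stratify=labels keeps the class proportions the same in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.25, stratify=labels, random_state=0
)

models = {
    "Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": LinearSVC(),
}

results = {}
for name, clf in models.items():
    # Fitting TF-IDF inside the pipeline avoids leaking test vocabulary into training.
    pipeline = make_pipeline(TfidfVectorizer(), clf)
    pipeline.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, pipeline.predict(X_test))

for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

On a synthetic corpus this small the accuracy values carry no meaning. The sketch only shows how the pieces fit together, and on imbalanced data it is usually worth reporting per-class precision, recall, and F1 alongside accuracy.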

Article History
Submitted: 2025-12-18
Published: 2026-03-19
How to Cite
Dewantoro, E., & Rakasiwi, S. (2026). Analisis Komparatif Kinerja Algoritma Support Vector Machine, Random Forest, dan Naive Bayes untuk Klasifikasi Sentimen pada Komentar YouTube. Building of Informatics, Technology and Science (BITS), 7(4), 2479-2490. https://doi.org/10.47065/bits.v7i4.8959
Section: Articles