Detecting Indonesian-Language Hoax News in Online News Media Using Multilingual-BERT


  • Syalma Tarissa * Universitas Bina Insan, Lubuklinggau, Indonesia
  • Muhamad Akbar Universitas Bina Insan, Lubuklinggau, Indonesia
  • A. Taqwa Martadinata Universitas Bina Insan, Lubuklinggau, Indonesia
  • Joni Karman Universitas Bina Insan, Lubuklinggau, Indonesia
  • (*) Corresponding Author
Keywords: Hoax News; Natural Language Processing; Transformer Model; Multilingual-BERT; Undersampling

Abstract

The rapid development of information technology can accelerate the spread of hoaxes in online news media and misinform the public. Manual hoax detection is difficult because of the large volume of online news, the variety of language styles, and mixed-language usage. Advances in artificial intelligence, particularly Natural Language Processing (NLP), open up significant opportunities for detecting hoax news automatically. Transformer-based models such as Multilingual-BERT (mBERT) can interpret text context in both directions and support many languages, including Indonesian. This study therefore applies mBERT to hoax news detection in online news media and tests its effectiveness. The hoax news data come from TurnbackHoax via the Kaggle website, the non-hoax news data from Tempo and CNN Indonesia, and additional data from a Kaggle dataset of news coverage in 2025. The data were tokenized with the built-in mBERT tokenizer and balanced with undersampling. The resulting model achieved an accuracy of 0.97, a precision of 0.96, a recall of 0.97, and an F1-score of 0.97, which shows that the mBERT approach can provide high classification performance on the Indonesian-language hoax news dataset.
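The abstract names two steps that a short example can make concrete: undersampling to balance the classes, and the four reported metrics. The sketch below is illustrative only and is not the authors' code. The function names `undersample` and `binary_metrics`, the toy labels, and the choice of label 1 for "hoax" are assumptions, and the abstract does not say whether the reported scores are computed for the hoax class or averaged across classes. Tokenization and model fine-tuning are omitted.

```python
import random
from collections import Counter


def undersample(texts, labels, seed=42):
    """Randomly drop examples from larger classes until every class
    has as many examples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    target = min(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        for text in rng.sample(items, target):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return [t for t, _ in balanced], [l for _, l in balanced]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, plus precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up, imbalanced toy data: 3 hoax (1) and 7 non-hoax (0) articles.
texts = [f"article {i}" for i in range(10)]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
balanced_texts, balanced_labels = undersample(texts, labels)
class_counts = Counter(balanced_labels)

# Made-up predictions, only to show how the metrics are derived.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(class_counts, metrics)
```

Undersampling discards majority-class articles, so it trades training data for class balance. In this sketch the whole toy set is balanced for brevity; in practice the balancing is usually applied to the training split only, so that the test set keeps its original class distribution.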




Article History
Published: 2026-04-27
Section
Articles