Indonesian-Language Hoax News Detection Using Multilingual-BERT on Online News Media
Abstract
The rapid development of information technology can accelerate the spread of hoaxes in online news media, resulting in misinformation for the public. Manual detection of hoaxes is difficult because of the large volume of online news, the variety of language styles, and mixed-language usage. Advances in artificial intelligence, particularly Natural Language Processing (NLP), open up significant opportunities for detecting hoax news automatically. Transformer-based NLP models such as Multilingual-BERT (mBERT) can model text context in both directions and support many languages, including Indonesian. This study therefore applies mBERT to hoax news detection in online news media and tests its effectiveness. The hoax news data comes from the TurnbackHoax dataset on Kaggle, the non-hoax news data comes from Tempo and CNN Indonesia, and additional data comes from a Kaggle dataset covering news from 2025. The data was processed with the built-in mBERT tokenizer and balanced with undersampling. The resulting model achieved an accuracy of 0.97, a precision of 0.96, a recall of 0.97, and an F1-score of 0.97, which indicates that the mBERT approach provides high classification performance on this Indonesian-language hoax news dataset.
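The undersampling step and the four reported metrics can be illustrated with a minimal sketch. It assumes binary labels (1 = hoax, 0 = non-hoax) with hoax as the positive class, and random undersampling down to the size of the smallest class; the abstract does not state the exact undersampling variant or the averaging scheme behind the reported scores. The function names `undersample` and `binary_metrics` are hypothetical, and tokenization and mBERT fine-tuning are left out.

```python
import random
from collections import defaultdict


def undersample(rows, label_of, seed=42):
    """Randomly drop rows so every class keeps as many rows as the smallest class.

    Assumption: plain random undersampling; the paper may use another variant.
    """
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data for illustration only: (text, label) pairs, not the paper's dataset.
toy_rows = [("hoax text", 1)] * 3 + [("regular news text", 0)] * 7
balanced_rows = undersample(toy_rows, label_of=lambda row: row[1])
print(len(balanced_rows), "rows after undersampling")

# Made-up predictions, standing in for the output of a fine-tuned classifier.
print(binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]))
```

In a real pipeline the balanced texts would be passed through the mBERT tokenizer before fine-tuning, and the metrics would be computed on a held-out test split rather than on toy predictions.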
Copyright (c) 2026 Syalma Tarissa, Muhamad Akbar, A. Taqwa Martadinata, Joni Karman

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).