Hate Speech Detection on YouTube Using Long Short-Term Memory and Latent Dirichlet Allocation Method
Abstract
YouTube is one of the most popular social media platforms, serving as a means of sharing information and expressing opinions. An opinion can be categorized as hate speech if it attacks its target. Hate speech is prohibited behavior, words, or actions, because it can incite violence against individuals and groups. Hate speech expressed as opinion remains very difficult for the authorities to address because it is so common. Therefore, this study built a system to detect hate speech in YouTube comment sections using Long Short-Term Memory (LSTM) and Latent Dirichlet Allocation (LDA). Several approaches were tried to obtain the best accuracy, and topic modeling with Latent Dirichlet Allocation produced three topics containing the words that appear most often in YouTube comments. In testing, the best accuracy obtained was 0.657, or about 66%.
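To illustrate the topic modeling step mentioned in the abstract, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling with three topics. It is not the authors' implementation: the function name lda_gibbs, the hyperparameters alpha and beta, the iteration count, and the toy comments are all assumptions chosen for illustration, and the toy data does not come from the study.

```python
import random


def lda_gibbs(docs, num_topics=3, alpha=0.1, beta=0.01,
              iterations=200, top_n=3, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling.

    Returns the top_n words per topic and the per-document topic counts.
    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token
                # (unnormalized; random.choices normalizes the weights).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    top_words = [
        [vocab[v] for v in
         sorted(range(vocab_size), key=lambda v: -topic_word[t][v])[:top_n]]
        for t in range(num_topics)
    ]
    return top_words, doc_topic


# Toy, already-tokenized comments (hypothetical placeholders, not study data).
comments = [
    ["video", "music", "great", "music"],
    ["music", "song", "great", "video"],
    ["news", "politics", "debate", "news"],
    ["politics", "debate", "election", "news"],
    ["game", "player", "team", "game"],
    ["team", "player", "match", "game"],
]

topics, doc_topic = lda_gibbs(comments, num_topics=3)
for t, words in enumerate(topics):
    print(f"Topic {t}: {', '.join(words)}")
```

In practice a library implementation would normally be used on preprocessed comment text; the sketch only shows how a fixed number of topics and their most frequent words can be derived from token counts.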
Pages: 644-650
Copyright (c) 2022 Andi Fadil Adiyaksa, Donny Richasdy, Aditya Firman Ihsan

This work is licensed under a Creative Commons Attribution 4.0 International License.