Hate Speech Classification in Tiktok Reviews using TF-IDF Feature Extraction, Differential Evolution Optimization, and Word2Vec Feature Expansion in a Classification System using Recurrent Neural Network (RNN)


  • Rizkialdy Fatha * Telkom University, Indonesia
  • Yuliant Sibaroni Telkom University, Indonesia
  • Sri Suryani Prasetyowati Telkom University, Indonesia
  • (*) Corresponding Author
Keywords: Classification; Term Frequency-Inverse Document Frequency; Word2Vec; Differential Evolution; Recurrent Neural Network

Abstract

In the digital era, social media platforms such as TikTok have become a primary channel for users to share opinions, experiences, and expressions. However, the growing prevalence of hate speech in Google Play Store reviews of the TikTok app calls for a more capable approach to identifying and classifying harmful content. This research aims to optimize the classification of hate speech in Google Play reviews of the TikTok app by integrating Term Frequency-Inverse Document Frequency (TF-IDF), Differential Evolution, and Word2Vec within a Recurrent Neural Network (RNN) model. TF-IDF is used to extract relevant features from each review, while Differential Evolution is applied to optimize the model parameters. Word2Vec enriches the representation of words in the context of app reviews, and the RNN model captures sequential patterns in hate speech. The results are expected to contribute to improved hate speech classification for app reviews on digital platforms.
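
As an illustration of the TF-IDF feature extraction step named in the abstract, the following is a minimal sketch only, not the authors' implementation. It assumes one common TF-IDF variant (term frequency divided by review length, inverse document frequency as the natural log of N/df) and whitespace tokenization. The function name `tfidf` and the toy reviews are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy reviews, used only to exercise the function.
reviews = [
    "this app is great",
    "this app is awful",
    "awful awful update",
]
weights = tfidf(reviews)
```

Under this variant, a term that appears in every review receives a weight of zero, while a term confined to a single review is weighted most heavily. That is why TF-IDF can highlight distinctive vocabulary before the features are passed to a classifier.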


Article History
Submitted: 2024-07-08
Published: 2024-09-09
How to Cite
Fatha, R., Sibaroni, Y., & Prasetyowati, S. (2024). Hate Speech Classification in Tiktok Reviews using TF-IDF Feature Extraction, Differential Evolution Optimization, and Word2Vec Feature Expansion in a Classification System using Recurrent Neural Network (RNN). Building of Informatics, Technology and Science (BITS), 6(2), 807–816. Retrieved from https://ejurnal.seminar-id.com/index.php/bits/article/view/5520
Section
Articles