Hate Speech Classification in TikTok Reviews using TF-IDF Feature Extraction, Differential Evolution Optimization, and Word2Vec Feature Expansion in a Classification System using Recurrent Neural Network (RNN)
Abstract
Social media platforms such as TikTok have become a primary channel for users to share opinions, experiences, and expressions. However, the growing prevalence of hate speech in Google Play Store reviews of the TikTok app calls for a more capable approach to identifying and classifying harmful content. This research aims to optimize the classification of hate speech in Google Play reviews of the TikTok app by integrating Term Frequency-Inverse Document Frequency (TF-IDF), Differential Evolution, and Word2Vec within a Recurrent Neural Network (RNN) model. TF-IDF is used to extract relevant features from each review, while Differential Evolution is applied to optimize the model parameters. Word2Vec enriches the representation of words in the context of app reviews, and the RNN model captures sequential patterns in hate speech. The results of this research are expected to contribute to improved hate speech classification on digital platforms, with a focus on app reviews.
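As an illustration of the TF-IDF step mentioned above, the following is a minimal sketch that assumes the plain textbook weighting (term frequency multiplied by log(N / document frequency)); the abstract does not state which TF-IDF variant the study uses. The function name `tf_idf` and the toy reviews are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: weight = (count / document length) * log(N / df),
    the plain textbook form, not necessarily the study's variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy reviews, not from the study's dataset.
reviews = [
    "this app is great".split(),
    "this app is awful".split(),
    "this music is great".split(),
]
weights = tf_idf(reviews)
```

Under this weighting, a term that occurs in every review (such as "this") receives a weight of zero, while a term confined to a single review (such as "awful") receives the largest weight in that review.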
Pages: 807−816
Copyright (c) 2024 Rizkialdy Fatha, Yuliant Sibaroni, Sri Suryani Prasetyowati
This work is licensed under a Creative Commons Attribution 4.0 International License.