Topic Detection on Twitter using GloVe with Convolutional Neural Network and Gated Recurrent Unit


Keywords: Twitter; Feature Expansion; GloVe; CNN; GRU

Abstract

Twitter is a social media platform that lets users share thoughts and information publicly. Because tweets are limited to 280 characters, Twitter users often resort to abbreviations, slang, and ungrammatical writing. As a result, topic detection on tweets often suffers from low accuracy; one method for overcoming this problem is feature expansion. Feature expansion semantically enriches the original short text with related terms so that it resembles a larger document, which reduces word mismatches. This study applies GloVe-based feature expansion together with Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU) classification methods. The results show that the topic detection system with GloVe feature expansion and hybrid CNN-GRU classification achieves an accuracy of 94.41%.
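To make the idea of feature expansion concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a binary bag-of-words representation and uses a tiny hand-made embedding table (`TOY_VECTORS`) in place of real pretrained GloVe vectors. The function names `most_similar` and `expand_features` are hypothetical. A vocabulary term that is absent from a tweet is switched on when one of its nearest neighbours in the embedding space does appear in the tweet, which is one way to reduce word mismatches.

```python
import math

# Hypothetical 3-dimensional vectors standing in for pretrained GloVe
# embeddings; real GloVe vectors have 50 to 300 dimensions.
TOY_VECTORS = {
    "football": [0.90, 0.10, 0.00],
    "soccer":   [0.85, 0.15, 0.05],
    "goal":     [0.70, 0.30, 0.10],
    "election": [0.00, 0.90, 0.20],
    "vote":     [0.05, 0.85, 0.25],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, top_n=1):
    """Return the top_n words closest to `word` by cosine similarity."""
    if word not in vectors:
        return []
    scored = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:top_n]]


def expand_features(tokens, vocabulary, vectors, top_n=1):
    """Build a binary feature vector, switching on absent terms whose
    nearest embedding neighbours occur in the tweet."""
    present = set(tokens)
    features = []
    for term in vocabulary:
        if term in present:
            features.append(1)  # term occurs literally in the tweet
        elif present.intersection(most_similar(term, vectors, top_n)):
            features.append(1)  # a similar word occurs: expand the feature
        else:
            features.append(0)
    return features


tweet = ["soccer", "match", "tonight"]
vocabulary = ["football", "election"]
expanded = expand_features(tweet, vocabulary, TOY_VECTORS)
print(expanded)
```

Without expansion, this tweet would have no active features for the vocabulary above, because "football" never appears literally. With expansion, the "football" feature is activated through its neighbour "soccer". The resulting vectors could then be passed to a classifier such as the hybrid CNN-GRU model the abstract describes.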




Article History
Submitted: 2023-08-10
Published: 2023-09-27
How to Cite
Ikfini M, M., & Setiawan, E. (2023). Topic Detection on Twitter using GloVe with Convolutional Neural Network and Gated Recurrent Unit. Building of Informatics, Technology and Science (BITS), 5(2), 386–396. https://doi.org/10.47065/bits.v5i2.4057
Section
Articles
