Comparative Analysis of MLP and GraphSAGE for Twitter/X Bot Detection on the TwiBot-22 Benchmark
Abstract
Bot accounts on Twitter/X remain a significant challenge because they affect information integrity, distort public discourse, and complicate platform moderation. This article evaluates two bot detection approaches on the TwiBot-22 benchmark: a profile-feature-based Multilayer Perceptron (MLP) and a graph-based GraphSAGE model, using a 12-Stage Evaluation Framework that covers data validation, feature engineering, model training, threshold analysis, feature ablation, and multi-seed evaluation. The study is limited to an offline benchmark setting with 1,000,000 labeled accounts, 13.99% bots and 86.01% humans, and a fixed split of 70% training, 20% validation, and 10% testing. In the single-seed 15-feature comparison, MLP achieved F1(bot) of 0.53 and PR-AUC of 0.48, while GraphSAGE reached F1(bot) of 0.53 and PR-AUC of 0.46. In the confirmatory three-seed evaluation, the user_only_8 configuration produced F1(bot) of 0.53 and PR-AUC of 0.49 with lower variance, whereas all_15 produced F1(bot) of 0.53 and PR-AUC of 0.47 with higher variance. These findings indicate that the more economical profile-only configuration preserves practically identical binary-decision quality, offers better probability ranking quality, and shows lower variance. The main contribution of this article is a feature-economy argument: on TwiBot-22, added graph and feature complexity does not automatically yield proportionate practical gains.
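To put the reported numbers in context, the following minimal sketch shows how F1 for the bot class and a PR-AUC estimate can be computed, and what the abstract's class balance and split imply. It is not the authors' evaluation code. The function names, the toy labels and scores, and the 0.5 threshold are illustrative assumptions. Average precision is used here as one common estimator of PR-AUC, and the abstract does not say which estimator the study used.

```python
from typing import Sequence


def f1_positive(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for the positive class only (bot = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each true positive.

    Assumes distinct scores; tied scores are not handled specially.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Figures stated in the abstract.
N_ACCOUNTS = 1_000_000
BOT_RATE = 0.1399  # 13.99% bots

# Split sizes implied by the fixed 70/20/10 split (integer arithmetic).
split_sizes = {
    name: N_ACCOUNTS * pct // 100
    for name, pct in [("train", 70), ("val", 20), ("test", 10)]
}

# A scorer that ranks accounts at random has an expected PR-AUC close to
# the positive-class prevalence, so this is the natural reference point.
random_baseline_pr_auc = BOT_RATE

# Hypothetical toy data, for illustration only (not from TwiBot-22).
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

if __name__ == "__main__":
    print("split sizes:", split_sizes)
    print("random-ranking PR-AUC baseline:", random_baseline_pr_auc)
    print("toy F1(bot):", f1_positive(toy_labels, toy_preds))
    print("toy average precision:", average_precision(toy_labels, toy_scores))
```

Read against a random-ranking baseline of about 0.14, the reported PR-AUC values of 0.46 to 0.49 are well above chance, while the differences between configurations are small by comparison.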
Pages: 2384-2392
Copyright (c) 2026 Mochammad Fikri Chaerul Chalik Ramdhan, Sigit Puspito Wigati Jarot

This work is licensed under a Creative Commons Attribution 4.0 International License.