Comparison of Random Forest and Decision Tree for Depression Detection Using Interaction Patterns

Felicia Talitha Fathin; Warih Maharani

doi:10.47065/bits.v6i4.6660

Felicia Talitha Fathin, Telkom University, Bandung, Indonesia
Warih Maharani *, Telkom University, Bandung, Indonesia

(*) Corresponding Author

Keywords: Depression; X; Random Forest; Decision Tree; TF-IDF; Bag of Words; Word2Vec

Abstract

This research evaluates the efficacy of Random Forest and Decision Tree in detecting depression from tweets and interaction patterns on the X social media platform. Depression, a global health problem, is often linked to individuals' online behavior. The study uses data from X users in Indonesia who completed the DASS-42 questionnaire; their tweets and interactions on X were collected by crawling. The aim is to identify signs of depression more accurately and comprehensively by analyzing users' interaction patterns on social media, integrating several feature extraction methods and preprocessing scenarios. The methods include data preprocessing, feature combination using TF-IDF, Bag of Words (BoW), and Word2Vec, and model evaluation using Precision, Recall, Accuracy, and F1-score. The findings show that Random Forest performs better than Decision Tree: the TF-IDF, BoW, Word2Vec combination and the TF-IDF, Word2Vec combination obtained an accuracy of 0.60. Although Random Forest is superior, both models struggle to identify the positive (depressed) class, as shown by the relatively low F1-score and recall values. Other factors affecting model performance include lack of data relevance, low interaction rates, and limited feature extraction.
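The abstract reports moderate accuracy alongside low recall and F1-score for the positive class. The minimal sketch below shows how the four metrics are computed from binary predictions and why they can diverge. The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not the study's data, code, or confusion matrix.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (1 = depressed, 0 = not depressed); NOT study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the classifier gets most negative examples right, so accuracy looks acceptable, yet it finds only one of the four positive examples, which pulls recall and F1-score down. This is the general pattern the abstract describes for the positive class.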
