Insomnia Prediction Based on Twitter User Activity Using Natural Language Processing and Machine Learning

Trisna Trisna; Asti Heliana

doi:10.47065/josyc.v6i4.8081

Trisna Trisna * Universitas Adhirajasa Reswara Sanjaya, Bandung, Indonesia
Asti Heliana Universitas Adhirajasa Reswara Sanjaya, Bandung, Indonesia

(*) Corresponding Author

Keywords: Ensemble Learning; Insomnia; LSTM; SVM; Twitter

Abstract

Insomnia is a common sleep disorder that significantly affects physical health, mental health, and productivity. Early detection remains difficult because its symptoms are hard to observe directly. This study uses 13,950 historical tweets from 4,286 Twitter accounts, posted between January 1 and April 30, 2025, to predict potential insomnia with Natural Language Processing (NLP) and machine learning. Insomnia labels are assigned with an expert-verified keyword-based approach, followed by preprocessing, temporal analysis, and sentiment analysis. Two classification models are compared: Support Vector Machine (SVM), which is well suited to separating classes in high-dimensional data, and Long Short-Term Memory (LSTM), which is well suited to capturing sequential patterns and temporal context. In preliminary results, SVM reached 89% accuracy, with higher recall on the non-insomnia class (precision 0.80, recall 0.97) than on the insomnia class (precision 0.92, recall 0.82). LSTM reached 90% accuracy and improved on SVM for the insomnia class (precision 0.98, recall 0.86), with slightly lower recall on the non-insomnia class (precision 0.81, recall 0.96). Because the two models have complementary strengths, their outputs were combined with a probabilistic ensemble averaging method, which yielded 92% accuracy and improved both classes (non-insomnia: precision 0.82, recall 0.99; insomnia: precision 1.00, recall 0.88). The ensemble is therefore more reliable than either single model for detecting potential insomnia.
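
The probabilistic ensemble averaging step can be illustrated with a minimal sketch. The abstract does not state the model weights or the decision threshold, so the equal weighting (`w_svm = 0.5`), the 0.5 threshold, the function names, and the sample probabilities below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def average_probabilities(
    p_svm: Sequence[float],
    p_lstm: Sequence[float],
    w_svm: float = 0.5,  # assumed equal weighting; not stated in the abstract
) -> List[float]:
    """Weighted mean of the two models' insomnia-class probabilities per tweet."""
    if len(p_svm) != len(p_lstm):
        raise ValueError("both models must score the same tweets")
    return [w_svm * a + (1.0 - w_svm) * b for a, b in zip(p_svm, p_lstm)]


def predict_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map averaged probabilities to labels: 1 = insomnia, 0 = non-insomnia."""
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical insomnia-class probabilities for four tweets.
svm_probs = [0.90, 0.40, 0.20, 0.55]
lstm_probs = [0.80, 0.70, 0.10, 0.35]

ensemble_probs = average_probabilities(svm_probs, lstm_probs)
ensemble_labels = predict_labels(ensemble_probs)
print(ensemble_labels)
```

In this toy input the two models disagree on the second and fourth tweets; averaging resolves each disagreement in favor of the more confident model, which is the intuition behind combining classifiers with different strengths.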
