Support Vector Machine and Naïve Bayes for Personality Classification Based on Social Media Posting Patterns
Abstract
This research investigates the use of Support Vector Machine (SVM) and Naïve Bayes models to classify personality traits based on social media posting patterns. The study combines textual features obtained with the Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) methods with feature expansion from the Linguistic Inquiry and Word Count (LIWC) tool, and assesses their influence on classification accuracy. Personality characteristics were mapped from social media posts using the Big Five Inventory (BFI-44). The findings show that the SVM model using the TF-IDF + LIWC feature set performs best, achieving 76.60% accuracy on the base model with a linear kernel. By comparison, the Naïve Bayes model also performed best with the same feature set, but reached only 59.57% accuracy, with a smoothing parameter of 0.01. Oversampling improved recall and precision, whereas undersampling had a negative effect on model performance. These findings highlight the benefit of combining TF-IDF and LIWC features to improve model effectiveness, with SVM producing the best overall results in personality classification from social media data.
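As a rough illustration of two ideas named in the abstract, the sketch below shows random oversampling of a minority class followed by a Naïve Bayes classifier with a smoothing parameter of 0.01. It is a minimal sketch, not the authors' pipeline: the abstract does not state which Naïve Bayes variant or oversampling method was used, so a multinomial model over raw word counts and simple random duplication are assumptions here. The class name `SmoothedNaiveBayes`, the helper names, the toy posts, and the labels are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


class SmoothedNaiveBayes:
    """Multinomial Naive Bayes over word counts with additive smoothing alpha (assumed variant)."""

    def __init__(self, alpha=1e-2):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy posts with an imbalanced label distribution (3 vs. 1).
texts = [
    "love meeting new people at parties",
    "great party with friends tonight",
    "excited to go out with friends",
    "quiet evening reading alone",
]
labels = ["high_extraversion"] * 3 + ["low_extraversion"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
model = SmoothedNaiveBayes(alpha=1e-2).fit(balanced_texts, balanced_labels)
prediction = model.predict("reading alone tonight")
print(Counter(balanced_labels), prediction)
```

With a small alpha such as 0.01, words that never occur in a class are penalized heavily, so the choice of smoothing value can noticeably change predictions on short posts. In a fuller setup, the word counts would be replaced by TF-IDF weights concatenated with LIWC category scores, which this sketch does not attempt.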
Pages: 1717-1731
Copyright (c) 2024 Bayu Seno Nugroho, Warih Maharani

This work is licensed under a Creative Commons Attribution 4.0 International License.