Support Vector Machine and Naïve Bayes for Personality Classification Based on Social Media Posting Patterns


  • Bayu Seno Nugroho *, Telkom University, Bandung, Indonesia
  • Warih Maharani Telkom University, Bandung, Indonesia
  • (*) Corresponding Author
Keywords: Support Vector Machine; Naive Bayes; Personality Classification; Social Media; Text Classification; BFI-44

Abstract

This research investigates the use of Support Vector Machine (SVM) and Naive Bayes models to classify personality traits based on social media posting patterns. The study combines textual features obtained with the Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) methods with feature expansion using the Linguistic Inquiry and Word Count (LIWC) tool, and assesses their influence on classification accuracy. Personality characteristics were mapped from social media posts using the Big Five Inventory (BFI-44). The findings show that the SVM model using the TF-IDF + LIWC feature set provides the best performance, achieving 76.60% accuracy on the base model with a linear kernel. In comparison, the Naive Bayes model performed best with the same feature set, achieving 59.57% accuracy with a smoothing parameter of 1e-2 (0.01). Although oversampling improved recall and precision, undersampling was found to have a negative effect on model performance. These findings highlight the benefit of combining TF-IDF and LIWC features, which improves model effectiveness, with SVM producing the best overall results in personality classification from social media data.
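
To make the role of the smoothing parameter concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over bag-of-words counts with additive smoothing set to 1e-2. It is not the authors' implementation: the function names (`tokenize`, `train_multinomial_nb`, `predict`), the toy posts, and the two class labels are hypothetical, and the TF-IDF weighting, LIWC features, and SVM model described in the abstract are left out.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for illustration)."""
    return text.lower().split()


def train_multinomial_nb(docs, labels, alpha=1e-2):
    """Fit a multinomial Naive Bayes model on bag-of-words counts.

    alpha is the additive smoothing parameter; 1e-2 mirrors the value
    quoted in the abstract, everything else here is an assumption.
    """
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    n_docs = len(docs)
    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, count in class_docs.items():
        model["log_prior"][label] = math.log(count / n_docs)
        # Smoothed denominator: class token total plus alpha per vocabulary word.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
    return model


def predict(model, text):
    """Return the label with the highest posterior log-score."""
    # Words never seen in training are ignored.
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    scores = {
        label: prior + sum(model["log_likelihood"][label][t] for t in tokens)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Toy posts and labels, invented purely for illustration.
posts = [
    "love meeting new friends at the party",
    "great party with friends tonight",
    "quiet evening reading alone at home",
    "staying home alone with a book",
]
labels = [
    "high_extraversion",
    "high_extraversion",
    "low_extraversion",
    "low_extraversion",
]

model = train_multinomial_nb(posts, labels, alpha=1e-2)
prediction = predict(model, "party with friends")
```

A small `alpha` such as 1e-2 keeps unseen words from zeroing out a class while staying close to the raw word frequencies; larger values pull all word probabilities toward uniform.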


Article History
Submitted: 2024-12-06
Published: 2024-12-19
How to Cite
Nugroho, B., & Maharani, W. (2024). Support Vector Machine and Naïve Bayes for Personality Classification Based on Social Media Posting Patterns. Building of Informatics, Technology and Science (BITS), 6(3), 1717-1731. https://doi.org/10.47065/bits.v6i3.6411