Comparison of Convolutional Neural Network and Support Vector Machine for Student Question Classification in ChatGPT-based Learning Tools

Brilliant Jordan; Kemas Muslim Lhaksmana

doi:10.47065/bits.v7i2.7841

Brilliant Jordan Telkom University, Bandung, Indonesia
Kemas Muslim Lhaksmana * Telkom University, Bandung, Indonesia

(*) Corresponding Author


Keywords: CNN; SVM; ChatGPT; Text Classification; Data Augmentation

Abstract

Artificial Intelligence (AI) has revolutionised educational tools by enabling systems that proactively understand and respond to student needs. ChatGPT is a widely used generative model for education in Indonesia. However, it struggles to classify student questions accurately because of ambiguous phrasing, overlapping sentence structures, and difficulty recognising intent, which limits its effectiveness as a learning assistant. This study compares two models for classifying student questions: Convolutional Neural Networks (CNN), which extract locally important features from word sequences, and Support Vector Machines (SVM), which are known for handling high-dimensional data and efficiently finding the optimal hyperplane for text classification. A dataset of 2,797 Indonesian ChatGPT interactions (71% clear vs. 29% unclear) was preprocessed through case folding, stop-word removal, stemming, and tokenisation. Synonym-based data augmentation was then applied to the minority class to balance the dataset. The models were tuned through grid or random search, and the best configuration of each was evaluated with 5-fold cross-validation and compared across three data splits (70:30, 80:20, and 90:10). Results showed that CNN achieved balanced scores of 0.90 for accuracy, precision, recall, and F1-score on the 90:10 split, outperforming SVM, which plateaued at 0.85 accuracy and dropped to 0.76 in F1-score. The CNN's filters over word embeddings generalised across the lexical variation introduced by augmentation, whereas the sparse TF-IDF vectors used by the SVM did not preserve this level of semantic information. These findings indicate that CNN is more adaptive to diverse data and better suited for integration into ChatGPT-based educational tools, particularly in supporting reliable classification and personalised AI feedback in student learning contexts.
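
The synonym-based balancing step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the synonym table, the function names `augment` and `balance`, and the four toy samples are all hypothetical, and the study's actual synonym source and replacement policy are not stated here. The sketch only shows the general idea of oversampling the minority class with synonym-substituted copies until the class counts match.

```python
import random
from collections import Counter

# Hypothetical toy synonym table (Indonesian); the study's real synonym
# source is not specified in the abstract.
SYNONYMS = {
    "jelaskan": ["terangkan", "uraikan"],
    "cara": ["metode", "langkah"],
    "contoh": ["sampel", "ilustrasi"],
}


def augment(tokens, rng):
    """Return a copy of tokens with one replaceable token swapped for a synonym."""
    positions = [i for i, tok in enumerate(tokens) if tok in SYNONYMS]
    if not positions:
        # Nothing to replace: fall back to a plain copy.
        return list(tokens)
    i = rng.choice(positions)
    out = list(tokens)
    out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def balance(samples, seed=0):
    """Add synonym-augmented copies to smaller classes until all classes
    are as large as the largest one. The input list is not modified."""
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    balanced = list(samples)
    for label, n in counts.items():
        pool = [tokens for tokens, lab in samples if lab == label]
        for _ in range(target - n):
            balanced.append((augment(rng.choice(pool), rng), label))
    return balanced


# Toy, already-tokenised questions: three "clear" and one "unclear".
data = [
    (["jelaskan", "cara", "kerja", "cnn"], "clear"),
    (["berikan", "contoh", "svm"], "clear"),
    (["apa", "itu", "tf-idf"], "clear"),
    (["cara", "itu", "gimana"], "unclear"),
]
balanced = balance(data)
print(Counter(label for _, label in balanced))
```

In this sketch, augmentation is applied only to classes that are smaller than the largest one, so the majority class is left unchanged. In a real pipeline, augmenting only the training portion of each split would avoid leaking augmented variants of a question into the test set.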


