Comparison of Support Vector Machine, Decision Tree, Naïve Bayes, and Neural Network Algorithms for Email Classification
Abstract
This study compares the effectiveness of four machine learning models for email classification: Support Vector Machine (SVM), Decision Tree, Naïve Bayes, and Neural Network. The datasets were obtained from Kaggle. The first dataset contains 18,650 emails (7,328 phishing and 11,322 non-phishing). The second dataset merges two separate datasets of Indonesian spam emails, for a total of 4,681 emails (2,670 spam and 2,011 non-spam); the merge was done to obtain a more representative amount of data for model evaluation. Averaged across the two datasets, Neural Network achieved the highest accuracy at 96.60%, followed by SVM at 96.43%, Decision Tree at 92.38%, and Naïve Bayes, the lowest, at 90.22%. Although Neural Network has the highest accuracy, other models may be more suitable depending on the needs of the system. Naïve Bayes, despite its lower accuracy, can be more useful in systems with limited computational resources because of its efficiency. SVM balances high accuracy with computational efficiency, making it a strong choice for systems that need near-top accuracy without a heavy computational burden. Decision Tree is the easiest to interpret, making it suitable for applications that require transparent decision making.
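To put the reported figures in context, the following minimal sketch checks the dataset totals and computes the accuracy of a trivial classifier that always predicts the larger class. It uses only the counts and average accuracies quoted in the abstract. The names `DatasetCounts`, `majority_baseline`, `REPORTED_ACCURACY`, and `RANKING` are illustrative assumptions and are not part of the study's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetCounts:
    """Class counts for one dataset, as quoted in the abstract."""
    name: str
    positive: int  # phishing or spam emails
    negative: int  # non-phishing or non-spam emails

    @property
    def total(self) -> int:
        return self.positive + self.negative

    @property
    def majority_baseline(self) -> float:
        # Accuracy of a trivial model that always predicts the larger class.
        return max(self.positive, self.negative) / self.total


DATASETS = [
    DatasetCounts("Phishing emails", positive=7_328, negative=11_322),
    DatasetCounts("Indonesian spam emails", positive=2_670, negative=2_011),
]

# Average accuracy (%) over both datasets, as reported in the abstract.
REPORTED_ACCURACY = {
    "Neural Network": 96.60,
    "SVM": 96.43,
    "Decision Tree": 92.38,
    "Naive Bayes": 90.22,
}

# Models ordered from highest to lowest reported accuracy.
RANKING = sorted(REPORTED_ACCURACY, key=REPORTED_ACCURACY.get, reverse=True)

if __name__ == "__main__":
    for ds in DATASETS:
        print(f"{ds.name}: {ds.total} emails, "
              f"majority-class baseline {ds.majority_baseline:.2%}")
    best = REPORTED_ACCURACY[RANKING[0]]
    for model in RANKING:
        gap = best - REPORTED_ACCURACY[model]
        print(f"{model}: {REPORTED_ACCURACY[model]:.2f}% "
              f"({gap:.2f} points below the best model)")
```

Always predicting the majority class would score roughly 61% on the first dataset and 57% on the second, so all four reported averages sit well above that trivial baseline.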
Pages: 2559-2572
Copyright (c) 2025 Dika Wicaksono, I Made Artha Agastya

This work is licensed under a Creative Commons Attribution 4.0 International License.