Comparative Analysis of the Support Vector Machine, Random Forest, and Naive Bayes Algorithms for Lung Cancer Prediction


  • Rendy Alfa Rizky (*), Universitas Buana Perjuangan Karawang, Karawang, Indonesia
  • Ahmad Fauzi, Universitas Buana Perjuangan Karawang, Karawang, Indonesia
  • Dwi Sulistya Kusumaningrum, Universitas Buana Perjuangan Karawang, Karawang, Indonesia
  • Hilda Yulia Novita, Universitas Buana Perjuangan Karawang, Karawang, Indonesia
  (*) Corresponding author
Keywords: Lung Cancer; SVM; Random Forest; Naive Bayes; Prediction

Abstract

The lungs are vital organs responsible for respiration and blood circulation, and smoking is the primary factor contributing to the development of lung cancer. In Indonesia, the prevalence of this disease continues to increase, placing it eighth in Southeast Asia. Globally, lung cancer accounts for approximately 11.6% of all cancer cases and 18% of all cancer-related deaths. This study compares the performance of the Support Vector Machine (SVM), Random Forest, and Naive Bayes algorithms in predicting lung cancer and determines the best-performing algorithm based on accuracy, precision, and recall. It uses the Lung Cancer Prediction dataset from Kaggle, which consists of 309 instances and 16 attributes. The research process comprises data collection, preprocessing, data transformation, feature selection, model development, and evaluation using a confusion matrix. The experimental results show that SVM and Naive Bayes achieve the same accuracy of 91.07%, while Random Forest obtains an accuracy of 89.28%. In terms of the other metrics, SVM performs more consistently, with a precision of 95% and a recall of 93%, whereas Naive Bayes shows a higher recall of 95% with a precision of 93%. Random Forest, by contrast, shows limitations in identifying non-cancer cases. Based on the overall results, SVM is considered the most suitable method because it provides a better balance of performance. The study indicates that machine learning has significant potential as a supporting tool for more accurate and efficient early detection of lung cancer.
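
The three metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of that calculation, not the authors' code. The `ConfusionMatrix` class and the counts passed to it are hypothetical and chosen only for illustration; they are not the confusion matrices reported in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = cancer)."""
    tp: int  # cancer cases predicted as cancer
    fp: int  # non-cancer cases predicted as cancer
    fn: int  # cancer cases predicted as non-cancer
    tn: int  # non-cancer cases predicted as non-cancer

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:
        # Share of all predictions that are correct.
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        # Share of predicted cancer cases that are truly cancer.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Share of true cancer cases that the model finds.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0


# Hypothetical counts for illustration only (not taken from the paper).
example = ConfusionMatrix(tp=40, fp=5, fn=10, tn=45)

print(f"accuracy  = {example.accuracy():.4f}")
print(f"precision = {example.precision():.4f}")
print(f"recall    = {example.recall():.4f}")
```

Two models can share the same accuracy while trading precision against recall, which is the pattern the abstract reports for SVM and Naive Bayes. Accuracy alone is therefore not enough to rank them.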

Article History
Submitted: 2026-04-08
Published: 2026-05-09
How to Cite
Rizky, R., Fauzi, A., Kusumaningrum, D., & Novita, H. (2026). Analisis Perbandingan Algortima Support Vector Machine, Random Forest dan Naive Bayes Untuk Prediksi Penyakit Kanker Paru-Paru. Journal of Information System Research (JOSH), 7(3), 763-773. https://doi.org/10.47065/josh.v7i3.9611