Comparative Analysis of Random Forest and Adaptive Boosting Methods for Leukemia Prediction with Microarray Data
Abstract
Cancer is the uncontrolled growth of cells that spread to other parts of the body. Cancer types are named after the organ in which they originate. One of them is leukemia, or blood cancer, a cancer of the bone marrow caused by genetic mutations. According to Global Cancer Statistics 2020, there were an estimated 19.3 million new cancer cases and 10 million cancer deaths in 2020, and the number of new cases is projected to rise globally by 47%, from 19.3 million to 28.4 million, by 2040. In Indonesia, leukemia ranked ninth among cancer types in 2020, with 14,979 new cases and 11,530 deaths. One way to address leukemia is to diagnose the acute leukemia category using DNA and genetic information. The purpose of this study is to compare the performance of the Random Forest and Adaptive Boosting (AdaBoost) methods in predicting leukemia types from microarray data, in order to determine which method classifies more effectively. The dataset consists of gene expression measurements from bone marrow and blood samples, obtained with DNA microarray technology, covering two categories of acute leukemia: Acute Myeloid Leukemia (AML) and Acute Lymphoblastic Leukemia (ALL). The gene expression data are classified with the Random Forest and AdaBoost methods to predict the acute leukemia category. The results show that Random Forest is the better method for predicting acute leukemia, with an Area Under the Curve (AUC) of 100%, accuracy of 92.9%, precision of 93.7%, recall of 92.9%, and F1-score of 92.7%, compared with AdaBoost, which achieved an AUC of 83.3%, accuracy of 85.7%, precision of 88.6%, recall of 85.7%, and F1-score of 85.1%.
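The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' pipeline: the data are a synthetic stand-in generated with `make_classification` (few samples, many features, to mimic the shape of microarray data), and the helper name `evaluate`, the split ratio, and the hyperparameters are all assumptions made for illustration. Precision, recall, and F1 use weighted averaging, which is also an assumption; it is consistent with the abstract, where recall equals accuracy for both models, as weighted recall always does.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit a classifier and return the five metrics named in the abstract.

    Hypothetical helper, not taken from the paper.
    """
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Probability of the positive class, used for the ROC AUC.
    y_score = model.predict_proba(X_test)[:, 1]
    return {
        "auc": roc_auc_score(y_test, y_score),
        "accuracy": accuracy_score(y_test, y_pred),
        # Weighted averaging is an assumption (see the text above).
        "precision": precision_score(y_test, y_pred, average="weighted",
                                     zero_division=0),
        "recall": recall_score(y_test, y_pred, average="weighted",
                               zero_division=0),
        "f1": f1_score(y_test, y_pred, average="weighted", zero_division=0),
    }


# Synthetic stand-in for gene expression data: label 0/1 plays the role of
# the two acute leukemia categories. Sizes are arbitrary, not the study's.
X, y = make_classification(n_samples=72, n_features=500, n_informative=20,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hyperparameters are illustrative defaults, not the values used in the study.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=0),
}

results = {name: evaluate(model, X_train, X_test, y_train, y_test)
           for name, model in models.items()}

for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Because the data are synthetic, the printed numbers will not match the values reported in the abstract; the sketch only shows how the two ensemble methods can be scored on the same held-out split with the same set of metrics.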
Pages: 1564-1572
Copyright (c) 2025 Juleha Irianti Heremba, Christian Dwi Suhendra, Marlinda Sanglise

This work is licensed under a Creative Commons Attribution 4.0 International License.