Phishing Website Classification Using the XGBoost Method with the Radial-Based Undersampling Data Balancing Technique
Abstract
Phishing websites are among the most prevalent forms of cyberattack and can cause significant financial and non-financial losses. Automatic phishing detection using machine learning has become an effective way to address this threat. This study classifies phishing websites using the Extreme Gradient Boosting (XGBoost) algorithm and addresses class imbalance by applying the Radial-Based Undersampling (RBU) method. In addition, hyperparameter tuning was performed with Random Search to optimize the model's performance. The dataset was obtained from the Kaggle platform and has an imbalanced class distribution in which non-phishing instances far outnumber phishing instances. Such an imbalance can bias the model toward the majority class and reduce its ability to detect minority-class patterns. In the evaluation, applying RBU significantly improved the model's ability to detect phishing instances, and hyperparameter tuning further improved its accuracy. The best model combined RBU with Random Search and reached an accuracy of 90.39% on the test data. These findings indicate that combining data balancing with model optimization is an effective approach to phishing website classification and can be applied to similar problems in cybersecurity.
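To make the balancing step concrete, the following is a minimal sketch of the idea behind radial-based undersampling, not the study's implementation. It assumes a Gaussian radial basis function over Euclidean distance, a hypothetical spread parameter `gamma`, and a simple rule that repeatedly discards the majority sample with the highest mutual class potential until both classes have the same size. The function names and the toy data are illustrative only; the abstract does not state the settings the authors used.

```python
import math


def rbf(a, b, gamma):
    """Gaussian radial basis function of the Euclidean distance between a and b."""
    return math.exp(-((math.dist(a, b) / gamma) ** 2))


def mutual_class_potential(point, majority, minority, gamma):
    """Majority potential minus minority potential at `point`.

    High values mark regions dominated by majority samples.
    """
    return (sum(rbf(point, m, gamma) for m in majority)
            - sum(rbf(point, m, gamma) for m in minority))


def radial_undersample(majority, minority, gamma=1.0):
    """Drop majority samples until the class sizes match (simplified sketch).

    Assumption: the sample with the highest potential is removed first, and the
    potentials of the remaining samples are reduced by the removed sample's
    contribution after every removal.
    """
    kept = list(majority)
    potentials = [mutual_class_potential(p, kept, minority, gamma) for p in kept]
    while len(kept) > len(minority):
        worst = max(range(len(kept)), key=potentials.__getitem__)
        removed = kept.pop(worst)
        potentials.pop(worst)
        potentials = [phi - rbf(p, removed, gamma)
                      for p, phi in zip(kept, potentials)]
    return kept


# Toy data (hypothetical): a dense non-phishing cluster near the origin and
# one non-phishing sample close to the two phishing samples.
non_phishing = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
phishing = [(5.0, 5.2), (4.9, 5.0)]
balanced_non_phishing = radial_undersample(non_phishing, phishing, gamma=1.0)
print(len(balanced_non_phishing), "non-phishing samples kept")
```

In this toy run the dense majority cluster is thinned first, so the retained majority samples are spread more evenly across the feature space. The balanced set would then be passed to the classifier and the hyperparameter search, which this sketch deliberately leaves out.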
Pages: 1153-1163
Copyright (c) 2025 Yoga Yoga, Fajri Rakhmat Umbara, Puspita Nurul Sabrina

This work is licensed under a Creative Commons Attribution 4.0 International License.