Phishing Website Classification Using the X-Gboost Method with the Radial Based Undersampling Data Balancing Technique


  • Yoga Yoga (*), Universitas Jenderal Achmad Yani, Cimahi, Indonesia
  • Fajri Rakhmat Umbara, Universitas Jenderal Achmad Yani, Cimahi, Indonesia
  • Puspita Nurul Sabrina, Universitas Jenderal Achmad Yani, Cimahi, Indonesia
  • (*) Corresponding Author
Keywords: Website Phishing; X-gboost; Radial Based Undersampling; Random Search; Data Imbalance

Abstract

Phishing websites are one of the most prevalent forms of cyberattack and can cause significant losses, both financial and non-financial. Automatic phishing detection using machine learning algorithms has become an effective way to address this threat. This study aims to classify phishing websites using the Extreme Gradient Boosting (XGBoost) algorithm and to address class imbalance by applying the Radial Based Undersampling (RBU) method. In addition, hyperparameter tuning was performed with the Random Search method to optimize the model's performance. The dataset was obtained from the Kaggle platform and has an imbalanced class distribution in which non-phishing instances far outnumber phishing instances. This imbalance can bias the model and reduce its ability to detect patterns in the minority class. Based on the evaluation results, applying RBU significantly improved the model's ability to detect phishing instances, while hyperparameter tuning further improved its accuracy. The best model combined RBU and Random Search and reached an accuracy of 90.39% on the test data. These findings indicate that combining data balancing with model optimization is an effective approach to phishing website classification and can be applied to similar cases in cybersecurity.
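The balancing step described in the abstract can be illustrated with a small, simplified sketch. It is not the authors' implementation. It assumes a variant of radial-based undersampling in which every majority sample receives a potential (the summed Gaussian radial contributions of the majority samples minus those of the minority samples) and the majority sample with the highest potential is discarded first. The function names, the Euclidean distance, the `gamma` value, and the toy data are all assumptions made for illustration.

```python
import math


def rbf(a, b, gamma):
    """Gaussian radial basis function of the Euclidean distance between a and b."""
    return math.exp(-(math.dist(a, b) / gamma) ** 2)


def radial_undersample(majority, minority, gamma=1.0):
    """Drop majority points until both classes have the same size.

    Simplified sketch: each majority point gets a potential equal to the
    summed RBF contribution of all majority points minus that of all
    minority points. The point with the highest potential is removed first
    (an assumption of this sketch, not a claim about the paper's procedure).
    """
    kept = list(majority)
    potential = [
        sum(rbf(x, m, gamma) for m in kept) - sum(rbf(x, n, gamma) for n in minority)
        for x in kept
    ]
    while len(kept) > len(minority):
        # Index of the majority point sitting in the most majority-dominated region.
        i = max(range(len(kept)), key=potential.__getitem__)
        removed = kept.pop(i)
        potential.pop(i)
        # The removed point no longer contributes to the majority potential.
        potential = [p - rbf(x, removed, gamma) for x, p in zip(kept, potential)]
    return kept


# Toy data (made up): a dense non-phishing cluster and two phishing points.
non_phishing = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (2.0, 2.0)]
phishing = [(2.2, 2.0), (2.0, 2.3)]
balanced_majority = radial_undersample(non_phishing, phishing, gamma=0.5)
print(len(balanced_majority), len(phishing))
```

In this toy run the redundant points inside the dense non-phishing cluster are removed first, so the isolated non-phishing point near the phishing samples is kept. In a full pipeline the balanced training set would then be passed to the classifier and the hyperparameter search.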




Article History
Submitted: 2025-07-08
Published: 2025-09-02
How to Cite
Yoga, Y., Umbara, F., & Sabrina, P. (2025). Klasifikasi Website Phishing Menggunakan Metode X-Gboost dengan Teknik Penyeimbang Data Radial Based Undersampling. Building of Informatics, Technology and Science (BITS), 7(2), 1153-1163. https://doi.org/10.47065/bits.v7i2.7920
Section
Articles