Performance of Random Forest and XGBoost in E-Commerce Fraud Detection Using CGAN Data Augmentation


  • Sarmini Sarmini (*), Universitas Ahmad Dahlan, Yogyakarta, Indonesia
  • Sunardi Sunardi, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
  • Abdul Fadlil, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
(*) Corresponding author
Keywords: Data Augmentation; CGAN; Fraud Detection; E-commerce; Machine Learning

Abstract

Fraud detection in e-commerce is challenging because of class imbalance: legitimate transactions far outnumber fraudulent ones. This research explores the use of a Conditional Generative Adversarial Network (CGAN) to generate synthetic fraudulent transactions and thereby address the imbalance. By increasing the amount of data in the minority class, it aims to improve the performance of two widely used machine learning algorithms, Random Forest and XGBoost. The dataset comprises 23,634 transactions: 22,412 non-fraudulent and 1,222 fraudulent. Accuracy, precision, recall, and F1-score were used to assess how well each model detects fraud on the imbalanced and the augmented datasets. The results show that CGAN augmentation markedly improves both models, especially their recall on fraudulent transactions. On the original imbalanced dataset, Random Forest and XGBoost showed low recall (12.81% and 13.08%) despite accuracy of 95.35% and 95.32%, respectively. After augmentation, recall rose to 95.15% for Random Forest and 95.22% for XGBoost, with F1-scores of 97.47% and 97.42% and accuracy of 97.50% and 97.42%, respectively. XGBoost showed a slight advantage in precision and recall over Random Forest, especially on the augmented dataset. These findings confirm the effectiveness of CGAN as a data augmentation method for improving fraud detection performance and offer a robust way to address data imbalance in the financial sector.
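
The gap between high accuracy and low recall reported for the imbalanced dataset can be illustrated with a short calculation. The sketch below is not the authors' code: the helper `binary_metrics` and the "always predict non-fraud" baseline are hypothetical, introduced only for illustration, and the baseline is evaluated on the full class counts from the abstract rather than on the (unstated) test split. The count of synthetic samples assumes a 1:1 class balance as the augmentation target, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Metrics:
    """Compute the four metrics named in the abstract from confusion-matrix counts.

    Fraud is treated as the positive class. Undefined ratios (zero
    denominators) are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Class counts stated in the abstract.
N_LEGIT = 22_412
N_FRAUD = 1_222

# Hypothetical baseline: label every transaction as non-fraud.
# Every fraud case becomes a false negative, every legitimate one a true negative.
baseline = binary_metrics(tp=0, fp=0, fn=N_FRAUD, tn=N_LEGIT)

# Assumption: augmentation aims for a 1:1 balance between the two classes.
synthetic_needed = N_LEGIT - N_FRAUD

if __name__ == "__main__":
    print(f"baseline accuracy: {baseline.accuracy:.2%}")
    print(f"baseline recall:   {baseline.recall:.2%}")
    print(f"synthetic fraud samples for 1:1 balance: {synthetic_needed}")
```

Under these assumptions, a model that never flags fraud already scores close to 95% accuracy with zero recall, which is why recall and F1-score are the more informative metrics on the imbalanced data.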



Article History
Submitted: 2024-12-10
Published: 2024-12-26
How to Cite
Sarmini, S., Sunardi, S., & Fadlil, A. (2024). Performa Random Forest dan XGBoost pada Deteksi Penipuan E-Commerce Menggunakan Augmentasi Data CGAN. Building of Informatics, Technology and Science (BITS), 6(3), 1919-1931. https://doi.org/10.47065/bits.v6i3.6430