Application of Machine Learning and Resampling Techniques to Credit Card Fraud Detection
Keywords:
Machine learning, Fraud detection, Random forest, Resampling techniques, XGBoost, TensorFlow, Deep neural network
Abstract
The application of machine learning algorithms to the detection of fraudulent credit card transactions is a challenging problem domain because the datasets are highly imbalanced and financial data is confidential. The imbalance means that legitimate transactions make up the large majority of each dataset, so a weak model that makes faulty predictions on fraudulent transactions can still report 99% accuracy and be assessed as high-performing. To build optimal models, four techniques were used in this research to prepare the datasets: the baseline train-test split method, the class-weighted hyperparameter approach, undersampling, and oversampling. Three machine learning algorithms were implemented to develop the models: Random Forest, XGBoost, and a TensorFlow Deep Neural Network (DNN). We observed that the DNN is more efficient than the other two algorithms in modelling the undersampled dataset, while overall all three algorithms performed better with oversampling than with undersampling. However, Random Forest performed better than the other algorithms in the baseline approach. Compared with some existing state-of-the-art works, our models achieved improved performance on real-world datasets.
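As an illustration of the accuracy problem and the two resampling ideas named in the abstract, the following minimal sketch uses a synthetic label set (990 legitimate, 10 fraudulent) rather than the study's data. The helper names `random_undersample` and `random_oversample` are hypothetical, and plain random resampling is an assumption here: the abstract does not state which undersampling or oversampling variant the authors used.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of true fraud cases (label 1) that were predicted as fraud."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    return true_pos / positives if positives else 0.0

def _split_indices(y):
    # Assumption: label 1 (fraud) is the minority class, label 0 the majority.
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    return minority, majority

def random_undersample(X, y, seed=0):
    """Keep every minority row and an equally sized random subset of majority rows."""
    rng = random.Random(seed)
    minority, majority = _split_indices(y)
    idx = minority + rng.sample(majority, len(minority))
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]

def random_oversample(X, y, seed=0):
    """Keep every row and duplicate random minority rows until classes are balanced."""
    rng = random.Random(seed)
    minority, majority = _split_indices(y)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = majority + minority + extra
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]

# Synthetic, illustrative data only: 990 legitimate (0) and 10 fraudulent (1) rows.
y = [0] * 990 + [1] * 10
X = [[float(i)] for i in range(len(y))]

# A "model" that labels every transaction as legitimate.
always_legit = [0] * len(y)
acc = accuracy(y, always_legit)   # high accuracy despite catching no fraud
rec = recall(y, always_legit)     # recall on the fraud class

X_under, y_under = random_undersample(X, y)
X_over, y_over = random_oversample(X, y)

print(f"accuracy={acc:.2f} recall={rec:.2f}")
print(f"undersampled: {len(y_under)} rows, {sum(y_under)} fraud")
print(f"oversampled:  {len(y_over)} rows, {sum(y_over)} fraud")
```

In practice, resampling of this kind is normally applied to the training split only, so that the held-out test set keeps the original class distribution.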
Copyright (c) 2022 Chinedu L. Udeze, Idongesit E. Eteng, Ayei E. Ibor
This work is licensed under a Creative Commons Attribution 4.0 International License.