Enhancing Fraud Detection in Financial Services Using Artificial Intelligence
As fraud becomes more sophisticated and rule-based systems reach their limits, detecting financial fraud places a heavy burden on financial institutions. This thesis applies artificial intelligence to the Sparkov dataset, in which 0.57% of transactions are fraudulent, to assess how effectively AI detects fraudulent credit card activity. The study compares Logistic Regression, Random Forest, XGBoost, Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks against a rule-based baseline on accuracy, precision, recall, F1-score and ROC-AUC. The pipeline covers preprocessing with SMOTE, feature engineering, hyperparameter tuning and a thorough evaluation. The results show that XGBoost and Logistic Regression perform best, with recalls of 0.8993 and 0.8844 and ROC-AUC scores of 0.9656 and 0.9624 respectively, well above the baseline recall of 0.6000 and ROC-AUC of 0.7500. However, their low precision indicates a large number of false positives, which limits how the models can be deployed in practice. CNN and LSTM achieve high precision (0.9699 and 1.0) but very low recall (0.0751 and 0.0061), so neither is usable for this task. City, amount, job, category and gender were found to have a strong influence on predictions, which raises concerns about bias. The study recommends XGBoost for fraud detection and outlines ways to improve the model’s results. These contributions improve the effectiveness of fraud detection and give financial institutions a sound framework for addressing economic, ethical and regulatory considerations.
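The trade-off between precision and recall reported above can be checked with a few lines of arithmetic. The sketch below is illustrative and is not the thesis code. It takes the precision and recall values for CNN and LSTM from the abstract and computes the F1-score they imply. It then shows how precision falls at a 0.57% fraud rate as the false-positive rate rises. The false-positive rates used (1% and 5%) are assumptions chosen for illustration, since the abstract does not report them, and the function names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_from_rates(prevalence: float, recall: float, fpr: float) -> float:
    """Precision implied by class prevalence, recall and false-positive rate."""
    true_pos = prevalence * recall          # share of all transactions correctly flagged
    false_pos = (1 - prevalence) * fpr      # share of all transactions wrongly flagged
    if true_pos + false_pos == 0:
        return 0.0
    return true_pos / (true_pos + false_pos)


FRAUD_RATE = 0.0057  # 0.57% of transactions are fraudulent (from the abstract)

# (precision, recall) pairs as reported in the abstract
REPORTED = {"CNN": (0.9699, 0.0751), "LSTM": (1.0, 0.0061)}

for name, (precision, recall) in REPORTED.items():
    print(f"{name}: implied F1 = {f1_score(precision, recall):.4f}")

# Assumed false-positive rates, for illustration only
XGBOOST_RECALL = 0.8993  # from the abstract
for fpr in (0.01, 0.05):
    implied = precision_from_rates(FRAUD_RATE, XGBOOST_RECALL, fpr)
    print(f"FPR = {fpr:.2f}: implied precision = {implied:.4f}")
```

Because legitimate transactions outnumber fraudulent ones by roughly 175 to 1, even a small false-positive rate produces more false alarms than true detections. This is consistent with the low precision the abstract reports for the high-recall models.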
Copyright (c) 2025 Shahbaj Ahmad

This work is licensed under a Creative Commons Attribution 4.0 International License.