Machine Learning for Enhanced Churn Prediction in Banking: Leveraging Oversampling and Stacking Techniques

Customer churn, Prediction, Machine Learning, random oversampling, Ensemble Model, oversampling, stacking model, k-fold validation

Authors

  • Omar Faruq, Department of Computer Science and Engineering, East West University, Dhaka-1212, Bangladesh
  • Fahad Ahammed, Department of Computer Science and Engineering, East West University, Dhaka-1212, Bangladesh
  • Arifa Sultana Mily, Department of Computer Science and Engineering, East West University, Dhaka-1212, Bangladesh
  • Ashraful Islam, Department of Computer Science and Engineering, East West University, Dhaka-1212, Bangladesh
Vol. 12 No. 09 (2024)
Engineering and Computer Science
September 13, 2024

Every business sector is becoming more competitive over time as more companies offer services to customers, and the banking sector is no different. With the plethora of options customers have for banking, retaining them may prove difficult for banks. This research helps banks predict which customers are likely to churn, allowing them to take precautions to stop those customers from leaving. In this study we used the K-Neighbors, Random Forest, XGBoost, and AdaBoost classifiers, along with an ensemble model (stacking technique) that uses all of these models together. The experiments were conducted on a dataset from Kaggle. Because the dataset was heavily imbalanced, oversampling methods such as Random Oversampling and SMOTE-ENN were applied. In data preprocessing, label encoding was performed, and k-fold cross-validation (k = 5) was used for validation. The highest accuracy, 97.31% (std: 0.0033, k = 5), was achieved by using Random Oversampling with the Stacking Model.
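
To make the combination of random oversampling and 5-fold validation concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the toy data, the helper names (`random_oversample`, `k_fold_indices`, `predict_1nn`, `cross_validate`), and the use of a 1-nearest-neighbour rule as a stand-in for the classifiers named above are all assumptions made for illustration. The sketch also assumes that oversampling is applied to the training folds only, which the abstract does not specify.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(X_train, y_train, x):
    """Label of the closest training row (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X_train]
    return y_train[dists.index(min(dists))]


def cross_validate(X, y, k=5, seed=0):
    """Return per-fold accuracy; oversampling touches training folds only
    (an assumption of this sketch, not stated in the abstract)."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        X_tr, y_tr = random_oversample([X[j] for j in train_idx],
                                       [y[j] for j in train_idx], seed)
        hits = sum(predict_1nn(X_tr, y_tr, X[j]) == y[j] for j in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Hypothetical toy data: 40 "stayed" rows near (0, 0), 10 "churned" rows near (5, 5).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)] + \
    [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(10)]
y = [0] * 40 + [1] * 10

X_bal, y_bal = random_oversample(X, y)   # balanced copy of the full toy set
scores = cross_validate(X, y, k=5)       # five per-fold accuracies
print(Counter(y_bal), round(mean(scores), 4), round(pstdev(scores), 4))
```

The mean and standard deviation printed at the end correspond to the style of reporting used in the abstract (an accuracy with a standard deviation over k = 5 folds). The numbers produced by this toy example say nothing about the reported 97.31%.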