Zhu, D., Gan, Y., & Chen, X. (2021). Domain Adaptation-Based Machine Learning Framework for Customer Churn Prediction Across Varying Distributions. Journal of Computational Methods in Engineering Applications, 1(1), 1–14. https://doi.org/10.62836/jcmea.v1i1.010102

Domain Adaptation-Based Machine Learning Framework for Customer Churn Prediction Across Varying Distributions

In today’s fiercely competitive business environment, accurately predicting customer churn is essential for improving customer retention and reducing financial losses. Traditional statistical approaches, although useful, often struggle to perform well across diverse customer data domains because data distributions vary from one domain to another. This research introduces a customer churn prediction method that uses domain adaptation to address this problem. Specifically, it employs the Correlation Alignment (CORAL) method to align the feature distributions of the source and target datasets, which improves the logistic regression model’s ability to generalize across customer segments. The customer data are first segmented into clearly defined clusters using the k-means algorithm, which helps identify and correct distributional discrepancies and thereby improves the model’s accuracy. Early results indicate that domain adaptation not only improves the model’s applicability across domains but also reduces the covariance difference between them from a substantial initial gap to nearly zero. This approach shows substantial potential to improve how businesses anticipate and manage customer behavior, providing a more adaptable and effective framework for customer churn prediction.

customer churn prediction; business analytics; domain adaptation
