Smart evaluation for deep learning model: churn prediction as a product case study

ABSTRACT


INTRODUCTION
Machine learning models have been used for prediction in many fields [1]-[4]. Many business markets are now saturated, with numerous service providers offering a wide variety of products, and competition between providers is fierce. In this setting, churn prediction is a business use case that applies data mining techniques to detect customers who are likely to cancel their subscription to a particular service [5], [6]. Customer behavior changes with the defined business use case, and customizing a prediction model for each case costs repeated time and effort [7]. Data scientists therefore automate the data modelling process so that it generalizes and can be offered as a service [8]. This research is part of implementing a customer relationship management (CRM) system called customer loyalty intelligent personalization (CLIP). CLIP is a smart, machine-learning-based personalized customer advisory system that can serve different kinds of business applications. It aims to help e-commerce and retail businesses retain their profits and their customer base. This paper proposes a framework for automating customer churn prediction with respect to the business use case. The paper is organized as follows: section 2 outlines a literature review, section 3 illustrates the …

The first essential step in churn prediction is to assess each customer's behavior. The customers' data is not labelled in advance as churned or not churned; however, the label can be inferred from their previous purchase transactions. This research proposes a methodology for inferring whether a customer's behavior indicates a possible churn. The main churn configurables are calculated from the real customer data: purchase frequency bypass times, reduction average purchase percentage, and average reduction purchase times. Figure 1 shows the pseudo-code that annotates customer data as churn or not churn. First, it fetches the customer's first and last visit dates. Second, it calculates the number of customer purchases and their average purchase values. Third, it labels customers as churned or not based on these values. After that, the data is divided into training, validation, and testing sets for further processing.
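The annotation and splitting steps described for Figure 1 can be sketched roughly as follows. This is a minimal illustration and not the pseudo-code of Figure 1 itself: the threshold names and values, the `RECENT_PURCHASES` window, the `default_gap_days` fallback, and the exact churn rules are all assumptions introduced here, since the text does not define the churn configurables precisely. The 60/20/20 proportions follow the split stated for the framework.

```python
import random
from datetime import date

# Hypothetical thresholds standing in for the "churn configurables";
# names and values are assumptions, not taken from the paper.
FREQUENCY_BYPASS_TIMES = 3   # tolerated multiples of the usual gap between purchases
REDUCTION_PERCENTAGE = 0.5   # relative drop in average purchase value treated as churn
RECENT_PURCHASES = 2         # how many latest purchases count as "recent"


def label_customer(purchases, today, default_gap_days=30):
    """Return 1 (churned) or 0 (not churned) for one customer.

    `purchases` is a non-empty list of (date, amount) tuples sorted by date.
    """
    # Step 1: first and last visit dates.
    first_visit, last_visit = purchases[0][0], purchases[-1][0]

    # Step 2: number of purchases and their average value.
    n_purchases = len(purchases)
    amounts = [amount for _, amount in purchases]
    average_value = sum(amounts) / n_purchases

    # Step 3: label from the values above (assumed rules).
    if n_purchases > 1:
        usual_gap = max((last_visit - first_visit).days / (n_purchases - 1), 1)
    else:
        usual_gap = default_gap_days  # no history to estimate a gap from
    if (today - last_visit).days > FREQUENCY_BYPASS_TIMES * usual_gap:
        return 1  # silent for much longer than usual
    recent = amounts[-RECENT_PURCHASES:]
    if sum(recent) / len(recent) < (1 - REDUCTION_PERCENTAGE) * average_value:
        return 1  # recent spending dropped sharply
    return 0


def split_dataset(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle rows and split them into training, validation, and testing sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * validation)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]
```

A fixed seed keeps the split reproducible between runs, which matters when different hyperparameter settings are compared on the same validation set.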

CUSTOMER CHURN PREDICTION FRAMEWORK
The convolutional neural network (CNN) is one of the best-known deep learning algorithms; its main strength is feature engineering without the need for domain expertise [5]-[16]. In this research, the CNN algorithm is applied to build the customer churn prediction model. The CNN hyperparameters, such as weight constraint, dropout rate, number of filters, number of dense neurons, kernel size, batch size, and momentum, are initialized randomly. These hyperparameters are then changed repeatedly to fit the CNN model to the customer data. The output is a customized churn prediction model that can be deployed in a specific business case.
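One way to picture the repeated adjustment of hyperparameters is as a search over combinations of candidate values, keeping the combination that scores best on validation data. The sketch below is an assumption-laden illustration: the hyperparameter names follow the text, but every value list is invented (chosen only so that the grid contains 768 combinations, the count reported in the case study), and `train_and_validate` is a hypothetical callable that would build and train a CNN and return its validation score.

```python
from itertools import product

# Hypothetical search space: names follow the text, values are assumptions.
SEARCH_SPACE = {
    "weight_constraint": [1, 2, 3, 4],
    "dropout_rate": [0.1, 0.2, 0.3, 0.4],
    "filters": [16, 32, 64, 128],
    "dense_neurons": [32, 64, 128],
    "kernel_size": [2, 3],
    "batch_size": [16, 32],
    "momentum": [0.9],
}


def hyperparameter_combinations(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def select_best(space, train_and_validate):
    """Try every combination and keep the one with the highest validation score.

    `train_and_validate` is a placeholder for building and training a CNN
    with the given hyperparameters and returning its validation score.
    """
    best_params, best_score = None, float("-inf")
    for params in hyperparameter_combinations(space):
        score = train_and_validate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Passing the training routine in as a callable keeps the search loop independent of any particular deep learning library.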
Figures 2 and 3 show the workflow that automatically transforms raw customer data into a customized churn prediction model. The input data represents customer behavior and includes the feature columns and the actual label indicating whether the customer is a churner or a non-churner. The output of the workflow is a prediction model. Figure 2 shows the basic processes that prepare the data before modelling: preprocessing, feature engineering, and data splitting. The input in Figure 2 is the raw customer data and the output is three organized datasets: training, validation, and testing. The training set is 60% of the full data, and 20% each is used for validation and testing to report the model performance.

Figure 3 shows the details of the auto-modelling processes, which search for the best-fitting prediction model for the provided customer data. The datasets formed previously are the input to the three auto-modelling processes: training, validating, and testing. In the auto-training process, the CNN hyperparameters form a list of combinations, and each combination is automatically applied to build and train the CNN, generating different prediction models. In the auto-validating process, the validation dataset evaluates each CNN model generated in the training process, and the most accurate model is saved. In the auto-testing process, the testing dataset evaluates the accepted CNN model, the evaluation metrics are reported, and the model is saved for predicting unseen customer data.

A case study presents a sample of real customer data, which is used to build a customized churn prediction model. The purpose of this section is to show how the prediction model is auto-trained and auto-evaluated to arrive at the recommended model. The four basic quantities for evaluating a supervised classification algorithm are true positive (TP), true negative (TN), false positive (FP), and false negative (FN) [17]; they underlie evaluation measures such as accuracy and F1 score, shown in (1) and (2) respectively. In the churn prediction case, TP is the number of customers correctly predicted as churners. TN is the number of customers correctly predicted as non-churners. FP is the number of customers who are non-churners but whom the model has labelled as churners. FN is the number of customers who are churners but whom the model has labelled as non-churners.

$$\mathit{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\mathit{F1} = \frac{2\,TP}{2\,TP + FP + FN} \tag{2}$$

Any CNN has a collection of hyperparameters, as mentioned above [16]-[23]. Each hyperparameter has a preferred value range and an impact on the performance of the built model. These hyperparameters can therefore form many combinations of values. Each combination is used to build, train, and evaluate the CNN using the equations above. In this case study, the data sizes are 476, 119, and 149 for training, validation, and testing respectively. Table 1 shows 20 of the 768 hyperparameter combinations to illustrate their impact on both training and validation performance. The accepted model attained an accuracy of 0.78 in training and 0.77 in testing, and an F1 score of 0.85 in training and 0.83 in testing.
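The accuracy and F1 score referred to as (1) and (2) can be computed directly from the four counts. The sketch below assumes binary labels with 1 for churner and 0 for non-churner, and uses the standard F1 definition 2TP / (2TP + FP + FN); the function names are illustrative and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = churner, 0 = non-churner)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all customers that were labelled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of the counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Eight customers: 3 TP, 3 TN, 1 FP, 1 FN.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), f1_score(tp, fp, fn))  # 0.75 0.75
```

F1 ignores TN, so it can differ noticeably from accuracy when churners and non-churners are unevenly represented.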
The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) [24], [25] are further evaluation measures, which assess model performance based on the TP and FP rates. Figure 4 shows two evaluation graphs for the accepted model generated in this case study. The right graph displays the ROC curve, whose AUC of 0.84 indicates high reliability in predicting unseen data, and the left graph shows the F1 score for both training and testing.
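To make the relation between the ROC curve and the AUC concrete, the following sketch builds the curve by sweeping a decision threshold over predicted churn scores and then integrates it with the trapezoidal rule. It is an illustration with made-up scores, not the evaluation code behind Figure 4, and it assumes that both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (FP rate, TP rate) pairs obtained by lowering the threshold.

    Assumes y_true holds 0/1 labels and contains at least one of each class.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples sharing a score are taken together so ties give one point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a curve given as ordered (x, y) points, by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Illustrative scores only: three churners (1) and three non-churners (0).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(y_true, scores)), 3))  # 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of churners from non-churners.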

CONCLUSION
In the current technological era, markets are crowded with service providers, which escalates competition between companies to preserve their customer bases and financial gains. Churn prediction is a problem that has recently attracted many researchers and business leaders. However, modelling customer data separately for each business case to generate a churn prediction model consumes considerable time and effort. This research therefore proposed an automated customer churn prediction service using the CNN algorithm. It facilitates generating a deep learning churn prediction model for each business case based on its customer behavior. A case study was presented to show the automatic adaptation of the CNN hyperparameters until the best-fitting model is selected. In this case study, the AUC reached 0.84. This research can contribute to automatically predicting and evaluating customer churn in both e-commerce and retail business applications.

Figure 4. Case study evaluation graphs

Table 1. CNN hyperparameters experiments

Smart evaluation for deep learning model: churn prediction as a product case … (Esam Mohamed Elgohary)