Use Configuration-Based Dependency Injection on TFLearn to Improve the Iterative Deep Learning Development Process
How deep is your learning?
Introduction
Deep learning has proven valuable across many areas of business. Using a deep learning framework such as TFLearn, a deep learning library featuring a higher-level API for TensorFlow, we can quickly develop and train a model to perform accurate and intuitive cognitive tasks.
Developing a good deep learning model is an iterative process of steps and sub-tasks that requires close collaboration among data scientists, machine learning engineers, and IT deployment support.
This article analyzes the pain points of the iterative deep learning model development process and proposes a solution that uses configuration-based dependency injection techniques in Python to build neural network classifiers on TFLearn.
With a step-by-step guide to building a sample application, a Smart Customer Retention System, we will see how this solution simplifies the development process and decouples data engineering from model design and training. Model development also becomes more productive because both complexity and inter-team dependencies shrink. The source code of the sample project can be found on GitHub.
Iterative Deep Learning
Developing a deep learning model using TFLearn follows a cycle of iterative sub-processes. Typically, development consists of iterations at different stages that build the model toward the desired cognitive results.
- Data Iteration: A deep learning project starts with data engineering. In this iteration, the data scientists shall:
- Analyze business requirements.
- Obtain the raw data and perform the data transformations.
- Extract the features and create raw datasets.
- Model Iteration: The model design/evaluation iteration starts once the datasets are ready.
- In this iteration, data scientists shall:
- Make datasets available in the desired format.
- Pre-process the datasets.
- In the same iteration, machine learning engineers shall:
- Design the neural network model.
- Train the model using the pre-processed datasets.
- Evaluate the model by checking accuracy and loss.
- If the model behaves poorly, both the machine learning engineers and the data scientists need to evaluate the results, identify the causes, make the fixes, and then iterate the cycle again.
- Release Iteration: After the model is trained and tested, the release iteration begins. In this iteration, the IT development team shall:
- Expose the model as microservices (see the serving sketch after this list).
- Deploy and publish services.
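The article does not show the serving code; as a minimal illustration only (Flask is my assumption here, and the module name application_context is inferred from the sample project layout), a trained classifier could be exposed like this:

from flask import Flask, jsonify, request

from application_context import ApplicationContext  # module name assumed from the sample project

app = Flask(__name__)
# Build the classifier factory once at startup
app_context = ApplicationContext.get_instance('conf/application_context.yaml')

@app.route('/classifiers/<name>/predictions', methods=['POST'])
def predict(name):
    """Run the named classifier on the posted rows and return the results as JSON."""
    rows = request.get_json()
    results = app_context.get_classifier(name).predict(rows)
    return jsonify([vars(result) for result in results])

if __name__ == '__main__':
    app.run(port=8080)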
The above iterative development process has the following disadvantages:
- Tightly coupled: The stages of data engineering, model design, and model training are tightly coupled. For example, to train a model, the machine learning engineer must know the details of the dataset.
- Poor maintainability: Although TFLearn has reduced the coding effort of building a deep learning model, the side effect is that multiple teams work on the same source code, which makes the system hard to maintain.
Is there a way to solve these issues?
Use Configuration-Based Dependency Injection on TFLearn
To improve the development process, let's propose a solution with the following advantages:
- Configuration-based development — By developing the application metadata as YAML configuration files, each team maintains only its own configuration files. Configuration-based development removes the dependencies across the teams and the stages.
- Better maintainability — The solution provides pre-built Python components as TFLearn API wrappers for data processing, model instantiation, and result handling. These components are injected into the classifier according to the classifier's dependency configurations.
- Separation of concerns — The core Python components are maintained by the IT team. Data scientists and machine learning engineers focus solely on data manipulation and model tuning by developing the configuration files. (The sketch after this list shows the reflection pattern at the heart of the injection.)
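Before walking through the components, here is the core of the technique in isolation: configuration-driven injection is just Python reflection over a module name and a class name. The helper name inject below is mine, for illustration; the real components inline this pattern.

def inject(conf, *args):
    """Instantiate the class named in a configuration block such as
    {'module': 'csv_loader', 'class_name': 'CsvLoader'}."""
    module = __import__(conf['module'])        # import the module named in the config
    cls = getattr(module, conf['class_name'])  # look the class up by name
    return cls(*args)                          # construct it with the caller's arguments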
Let's take an in-depth look at the architecture and the implementation:
Singleton Application Context
A singleton ApplicationContext object works as a classifier factory. It loads the application metadata and instantiates the classifiers using Python reflection.
import yaml


class ApplicationContext:
    """ Singleton class that initializes the application and works as a classifier factory. """

    __instance = None

    @staticmethod
    def get_instance(name):
        """ Static method returning the singleton instance """
        if ApplicationContext.__instance is None:
            ApplicationContext(name)
        return ApplicationContext.__instance

    def __init__(self, name):
        """ Private constructor instantiating the singleton object.
        Creates the classifier instances using reflection, based on the configurations.
        """
        if ApplicationContext.__instance is None:
            self.classifier_instances = {}
            with open(name) as f:
                self.classifiers = yaml.safe_load(f)['classifiers']
            for classifier in self.classifiers:
                # Import the configured module, look up the configured class by
                # name, and construct it with its name and configuration block
                self.classifier_instances[classifier] = getattr(
                    __import__(self.classifiers[classifier]['module']),
                    self.classifiers[classifier]['class_name'])(
                    classifier, self.classifiers[classifier])
            ApplicationContext.__instance = self

    def get_classifier(self, classifier):
        """ Get a classifier instance by its name """
        return self.classifier_instances[classifier]
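With the factory in place, bootstrapping the application takes two lines (the same calls appear in the training section later in this article):

app_context = ApplicationContext.get_instance('conf/application_context.yaml')
churn_classifier = app_context.get_classifier('customer_churn')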
Classifier
Classifier components are instantiated and maintained by the application context. A classifier provides the core functions to train the model and predict on input data. It contains three sub-components: a data processor, a neural network model, and a result handler. The required sub-components are injected into the classifier according to the classifier's configuration.
Below is a sample classifier:
class CsvDataClassifier:
    """A classifier providing predictions for CSV data"""

    def __init__(self, name, conf):
        """ Constructor: instantiates a classifier with the following sub-objects
        by loading the configurations:
        a) build data processor
        b) build neural network
        c) build result handler
        """
        self.name = name
        self.conf = conf
        self.data_loader = getattr(__import__(self.conf['data_loader']['module']),
                                   self.conf['data_loader']['class_name'])(
            self.conf['data_loader']['data_config'])
        self.neural_network = getattr(__import__(self.conf['neural_network']['module']),
                                      self.conf['neural_network']['class_name'])(
            name, self.conf['neural_network']['dnn_config'])
        self.result_handler = getattr(__import__(self.conf['result_handler']['module']),
                                      self.conf['result_handler']['class_name'])(
            self.conf['result_handler']['handler_config'])

    def build_model(self):
        """ Build the classifier:
        a) load and pre-process the data
        b) build the neural network model
        c) train the model
        """
        self.data_loader.pre_process()
        self.neural_network.build()
        self.neural_network.train(self.data_loader.rows, self.data_loader.labels, True)
        return 'Classifier model ' + self.name + ' is built successfully.'

    def predict(self, data):
        """ Predict from the input data """
        ids = [row[0] for row in data]
        data = self.data_loader.process_data(data)
        results = self.neural_network.predict(data)
        return self.result_handler.handle_result(ids, results)
Data Processor
The data processor performs the data pre-processing. It is instantiated from the configurations designed by the data scientists. Shown below is a CSV data processor:
import yaml
import numpy as np
from tflearn.data_utils import load_csv


class CsvLoader:
    """A data processor for CSV datasets"""

    def __init__(self, conf_file):
        """ Constructor initializing the CSV data loader from its configuration file """
        with open(conf_file) as f:
            # use safe_load instead of load
            self.data_conf = yaml.safe_load(f)
        self.rows, self.labels = load_csv(self.data_conf['data_file'],
                                          target_column=self.data_conf['target_column'],
                                          categorical_labels=True,
                                          n_classes=self.data_conf['n_classes'])
        self.columns_to_ignore = self.data_conf['columns_to_ignore']
        self.conversion_map = self.data_conf['conversion_map']

    def __pre_process_data__(self, rows):
        """ Private method for CSV data pre-processing """
        if rows is None:
            rows = self.rows
        # Delete the ignored columns, highest index first so positions don't shift
        for column_to_ignore in sorted(self.columns_to_ignore, reverse=True):
            [row.pop(column_to_ignore) for row in rows]
        for i in range(len(rows)):
            # Convert categorical values to indices using the conversion_map
            for j in range(len(rows[i])):
                if j in self.conversion_map:
                    rows[i][j] = self.conversion_map[j].index(rows[i][j])
        return np.array(rows, dtype=np.float32)

    def pre_process(self):
        """ Public method for pre-processing the loaded dataset """
        self.rows = self.__pre_process_data__(None)

    def process_data(self, rows):
        """ Public method for processing new input data """
        return self.__pre_process_data__(rows)
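To see the conversion rules in action, here is a tiny self-contained sketch of the same pop-and-map logic the loader applies; the columns and values are invented for illustration:

import numpy as np

columns_to_ignore = [0]                       # drop the id column
conversion_map = {0: ['Female', 'Male'],      # after the pop, column 0 is gender
                  2: ['No', 'Yes']}           # and column 2 is a yes/no flag

rows = [['4d316bef', 'Male', 42, 'Yes']]
for column in sorted(columns_to_ignore, reverse=True):
    [row.pop(column) for row in rows]         # remove ignored columns, rightmost first
for row in rows:
    for j in range(len(row)):
        if j in conversion_map:
            row[j] = conversion_map[j].index(row[j])  # categorical string -> index

print(np.array(rows, dtype=np.float32))       # prints [[ 1. 42.  1.]]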
Neural Network Model
The classifier instantiates the neural network model from the hyperparameter configurations prepared by the machine learning engineers. Consuming the pre-processed dataset from the data processor, the classifier invokes the neural network component to train the model and persists it for later prediction.
import yaml
import tflearn


class DnnModel:
    def __init__(self, name, conf_file):
        with open(conf_file) as f:
            # use safe_load instead of load
            self.model_conf = yaml.safe_load(f)
        self.model = None
        self.name = name
        self.path = 'model/' + self.name + '/' + self.name + '.tfmodel'

    def __load__(self):
        self.model.load(self.path)

    def build(self):
        # Input layer, e.g. shape [None, 9]
        net = tflearn.input_data(shape=self.model_conf['input_layer']['shape'])
        # Hidden layers, each with an optional dropout
        for i in range(len(self.model_conf['hidden_layers'])):
            net = getattr(tflearn, self.model_conf['hidden_layers'][i]['type'])(
                net, self.model_conf['hidden_layers'][i]['neuron'],
                activation=self.model_conf['hidden_layers'][i]['activation'])
            if self.model_conf['hidden_layers'][i]['dropout'] is not None:
                net = tflearn.dropout(net, self.model_conf['hidden_layers'][i]['dropout'])
        # Output layer and regression settings
        net = getattr(tflearn, self.model_conf['output_layer']['type'])(
            net, self.model_conf['output_layer']['neuron'],
            activation=self.model_conf['output_layer']['activation'])
        net = tflearn.regression(net, optimizer=self.model_conf['regression']['optimizer'],
                                 loss=self.model_conf['regression']['loss'])
        # Define model
        self.model = tflearn.DNN(net)
        return self.model

    def train(self, data, labels, save_model):
        self.model.fit(data, labels, n_epoch=self.model_conf['fit']['n_epoch'],
                       batch_size=self.model_conf['fit']['batch_size'],
                       show_metric=self.model_conf['fit']['show_metric'])
        if save_model:
            self.model.save(self.path)

    def predict(self, data):
        # Lazily rebuild the graph and reload the persisted model if needed
        if self.model is None:
            self.build()
            self.__load__()
        return self.model.predict(data)
Result Handler
The result handler is an optional component of a classifier. It translates the classification result into meaningful, human-readable data, and is business-specific.
Shown below is a sample result handler for the customer churn prediction. It returns the customer's loyalty score and the churn prediction as a Yes or No:
class CustomerChurnHandler:

    class CustomerChurnResult:
        def __init__(self, customer_id, churn, score):
            self.id = customer_id
            self.churn = churn
            self.score = score

    def __init__(self, conf_file):
        # This handler needs no configuration; conf_file is unused
        self.results = []

    def handle_result(self, ids, data):
        self.results = []
        for i in range(len(data)):
            # Output index 1 holds the churn probability; index 0 the loyalty score
            result = CustomerChurnHandler.CustomerChurnResult(
                ids[i],
                'Yes' if data[i][1] >= 0.5 else 'No',
                round(data[i][0] * 100, 2))
            self.results.append(result)
        return self.results
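The application metadata in the next section also references a CustomerRetentionHandler, which the article does not list. A plausible sketch, assuming its handler_config file maps output class indices to offer names and that the handler returns the top three offers per customer (matching the result[i].recommendations usage later in this article), might look like this:

import yaml


class CustomerRetentionHandler:
    """Hypothetical sketch: maps prediction scores to the top-3 retention offers."""

    class CustomerRetentionResult:
        def __init__(self, customer_id, recommendations):
            self.id = customer_id
            self.recommendations = recommendations

    def __init__(self, conf_file):
        with open(conf_file) as f:
            # Assumed config shape: offers: ['Plan Upgrade', ...] indexed by class
            self.offers = yaml.safe_load(f)['offers']
        self.results = []

    def handle_result(self, ids, data):
        self.results = []
        for i in range(len(data)):
            # Rank offer indices by descending score and keep the top three
            top3 = sorted(range(len(data[i])), key=lambda j: data[i][j], reverse=True)[:3]
            recommendations = [self.offers[j] for j in top3]
            self.results.append(
                CustomerRetentionHandler.CustomerRetentionResult(ids[i], recommendations))
        return self.results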
Next, let's look at a sample use case to see how the proposed solution simplifies model development.
Sample Use Case: A Smart Customer Retention System
The sample Smart Customer Retention use case illustrates how deep learning adds value to a Telecom BSS. In this scenario, a telecom service provider notices the rising cost of its manual customer retention process. The business team initiates a strategic project to address the issue by building a system using deep learning:
- The system should predict customer churn to reduce the cost of the customer management process.
- If a customer is predicted to churn, the system should process the customer disconnect automatically.
- If a customer is predicted to stay, the system should recommend offers for customer retention.
- The retention department can then work with the customer by choosing an offer from the recommendations.
The system contains two deep learning classifiers:
- Predict customer churn.
- Recommend a list of retention offers for a customer.
Let's see how to build these classifiers by writing configuration files rather than Python code.
Application Metadata
First, we build the application metadata. It defines two classifiers: Customer Churn and Customer Retention. Each classifier contains a set of configurations for its data processor, neural network model, and result handler.
#Configuration for classifiers
classifiers:
  customer_churn: #customer churn classifier
    module: data_classifier
    class_name: CsvDataClassifier
    data_loader:
      module: csv_loader
      class_name: CsvLoader
      data_config: conf/customer_churn/data_conf.yaml
    neural_network:
      module: dnn_model
      class_name: DnnModel
      dnn_config: conf/customer_churn/dnn_conf.yaml
    result_handler:
      module: customer_churn_handler
      class_name: CustomerChurnHandler
      handler_config:
  customer_retention: #customer retention recommender
    module: data_classifier
    class_name: CsvDataClassifier
    data_loader:
      module: csv_loader
      class_name: CsvLoader
      data_config: conf/customer_retention/data_conf.yaml
    neural_network:
      module: dnn_model
      class_name: DnnModel
      dnn_config: conf/customer_retention/dnn_conf.yaml
    result_handler:
      module: customer_retention_handler
      class_name: CustomerRetentionHandler
      handler_config: conf/customer_retention/handler_conf.yaml
Data Engineering
Second, data scientists analyze the raw data and prepare the original CSV dataset. Below is a sample of the customer churn dataset:
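(The original article shows the dataset as an image. For orientation, here is an invented fragment consistent with the configuration below: the churn label comes first, as the target_column, followed by the customer id and the attribute columns used in the prediction examples later in this article. The values, including the labels, are illustrative only.)

1,4d316bef-9856-4ea0-aed0-a53e55fed3db,Male,K-12 student,20,No,Yes,12,No,No,Wireless
0,7150ae6c-1120-4eb5-b788-0f822f986fae,Female,Executive/Management,90,No,No,36,Yes,Yes,Bundle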
Data scientists provide the configurations based on the raw CSV dataset. The configuration file defines the dataset schema and the rules of data pre-processing:
#dataset processing configurations for customer churn
data_file: data/customer_churn.csv
target_column: 0
n_classes: 2
columns_to_ignore:
  - 0
conversion_map:
  0:
    - 'Female'
    - 'Male'
  1:
    - 'K-12 student'
    - 'Unemployed'
    - 'College/Grad student'
    - 'Food Preparation and Serving Related'
    - 'Personal Care and Service'
    - 'Building and Grounds Cleaning and Maintenance'
    - 'Farming, Fishing, and Forestry'
    - 'Healthcare Support Occupations'
    - 'Self-employed'
    - 'Transportation and Material Moving'
    - 'Office and Administrative Support'
    - 'Production Occupations'
    - 'Sales and Related'
    - 'Retired'
    - 'Protective Service'
    - 'Installation, Maintenance, and Repair'
    - 'Community and Social Service'
    - 'Construction and Extraction'
    - 'Other - Not specified'
    - 'Education, Training, and Library'
    - 'Arts, Design, Entertainment, Sports, and Media'
    - 'Life, Physical, and Social Science'
    - 'Business and Financial Operations'
    - 'Healthcare Practitioners and Technical'
    - 'Architecture and Engineering'
    - 'Computer and Mathematical'
    - 'Legal Occupations'
    - 'Executive/Management'
  3:
    - 'No'
    - 'Yes'
  4:
    - 'No'
    - 'Yes'
  6:
    - 'No'
    - 'Yes'
  7:
    - 'No'
    - 'Yes'
  8:
    - 'Bundle'
    - 'Wireless'
    - 'Internet'
    - 'TV'
    - 'Home Phone'
Model Configuration and Tuning
Now comes the fun part: neural network design. The machine learning engineers provide the configurations of the neural network model. A typical configuration includes hyperparameters such as the layers, the number of neurons in each layer, the activation functions, the learning-rate optimization algorithm, and the cost function.
The sample below shows the configuration for a deep neural network with one input layer, three hidden layers, and one output layer:
#Neural network configurations
input_layer:
  shape:
    -
    - 9
hidden_layers:
  - type: 'fully_connected'
    neuron: 32
    activation: 'relu'
    dropout:
  - type: 'fully_connected'
    neuron: 64
    activation: 'relu'
    dropout:
  - type: 'fully_connected'
    neuron: 32
    activation: 'relu'
    dropout:
output_layer:
  type: 'fully_connected'
  neuron: 2
  activation: 'softmax'
regression:
  optimizer: 'adam'
  loss: 'categorical_crossentropy'
fit:
  n_epoch: 10
  batch_size: 16
  show_metric: True
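For reference, this configuration unrolls to roughly the following TFLearn calls inside DnnModel.build (a sketch; since each dropout entry is left empty, no dropout layers are added):

import tflearn

net = tflearn.input_data(shape=[None, 9])                    # input layer
net = tflearn.fully_connected(net, 32, activation='relu')    # hidden layer 1
net = tflearn.fully_connected(net, 64, activation='relu')    # hidden layer 2
net = tflearn.fully_connected(net, 32, activation='relu')    # hidden layer 3
net = tflearn.fully_connected(net, 2, activation='softmax')  # output layer
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy')
model = tflearn.DNN(net)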
Train the Model
Below is the Python code to train the model:
app_context = ApplicationContext.get_instance('conf/application_context.yaml')
app_context.get_classifier('customer_churn').build_model()
The result is shown below:
| Adam | epoch: 010 | loss: 0.46729 - acc: 0.7987 -- iter: 099888/100000
Training Step: 62494 | total loss: 0.45866 | time: 12.523s
| Adam | epoch: 010 | loss: 0.45866 - acc: 0.8063 -- iter: 099904/100000
Training Step: 62495 | total loss: 0.47568 | time: 12.525s
| Adam | epoch: 010 | loss: 0.47568 - acc: 0.8007 -- iter: 099920/100000
Training Step: 62496 | total loss: 0.47255 | time: 12.527s
| Adam | epoch: 010 | loss: 0.47255 - acc: 0.8081 -- iter: 099936/100000
Training Step: 62497 | total loss: 0.45236 | time: 12.529s
| Adam | epoch: 010 | loss: 0.45236 - acc: 0.8041 -- iter: 099952/100000
Training Step: 62498 | total loss: 0.45618 | time: 12.531s
| Adam | epoch: 010 | loss: 0.45618 - acc: 0.8083 -- iter: 099968/100000
Training Step: 62499 | total loss: 0.46965 | time: 12.532s
| Adam | epoch: 010 | loss: 0.46965 - acc: 0.7962 -- iter: 099984/100000
Training Step: 62500 | total loss: 0.46448 | time: 12.533s
| Adam | epoch: 010 | loss: 0.46448 - acc: 0.8148 -- iter: 100000/100000
Predict Customer’s Churn and Get Loyalty Score
After training the model, we can run the customer churn classifier to predict churn for three customers: Tony, John, and Mary.
tony = ['4d316bef-9856-4ea0-aed0-a53e55fed3db', 'Male', 'K-12 student', 20, 'No', 'Yes', 12, 'No', 'No', 'Wireless']
john = ['4d316bef-9856-4ea0-aed0-a53e55fed3df', 'Male', 'Sales and Related', 75, 'Yes', 'Yes', 26, 'Yes', 'No', 'Wireless']
mary = ['7150ae6c-1120-4eb5-b788-0f822f986fae', 'Female', 'Executive/Management', 90, 'No', 'No', 36, 'Yes', 'Yes', 'Bundle']
result = app_context.get_classifier('customer_churn').predict([tony, john, mary])
print("Customer churn prediction for Tony - churn: {} - score: {}".format(result[0].churn, result[0].score))
print("Customer churn prediction for John - churn: {} - score: {}".format(result[1].churn, result[1].score))
print("Customer churn prediction for Mary - churn: {} - score: {}".format(result[2].churn, result[2].score))
Shown below is the customer churn prediction result. The score is the loyalty score computed by the result handler (the model's stay probability as a percentage), and churn is flagged Yes when the churn probability reaches 0.5:
Customer churn prediction for Tony - churn: Yes - score: 1.53
Customer churn prediction for John - churn: No - score: 73.91
Customer churn prediction for Mary - churn: No - score: 99.41
Provide Recommended Offers for Customer Retention
Below is the sample code to run the recommendation engine for customer retention.
tony = ['4d316bef-9856-4ea0-aed0-a53e55fed3db', 'Male', 'K-12 student', 20, 'No', 'Yes', 12, 'No', 'No', 'Wireless']
john = ['4d316bef-9856-4ea0-aed0-a53e55fed3df', 'Male', 'Sales and Related', 75, 'Yes', 'Yes', 26, 'Yes', 'No', 'Wireless']
mary = ['7150ae6c-1120-4eb5-b788-0f822f986fae', 'Female', 'Executive/Management', 90, 'No', 'No', 36, 'Yes', 'Yes', 'Bundle']
result = app_context.get_classifier('customer_retention').predict([tony, john, mary])
print("Top3 retention recommendations for Tony: \n 1: {} \n 2: {} \n 3: {}".format(
result[0].recommendations[0],
result[0].recommendations[1],
result[0].recommendations[2]))
print("Top3 retention recommendations for John: \n 1: {} \n 2: {} \n 3: {}".format(
result[1].recommendations[0],
result[1].recommendations[1],
result[1].recommendations[2]))
print("Top3 retention recommendations for Mary: \n 1: {} \n 2: {} \n 3: {}".format(
result[2].recommendations[0],
result[2].recommendations[1],
result[2].recommendations[2]))
Below is the result of the top three recommended retention offers for each customer:
Top3 retention recommendations for Tony:
1: Service or Plan Downgrade
2: Service Discount or Account Credit
3: Extended Promotion or Price Lock Down
Top3 retention recommendations for John:
1: Additional Add-on Services
2: Service Discount or Account Credit
3: Extended Promotion or Price Lock Down
Top3 retention recommendations for Mary:
1: Additional Service Usage
2: Plan Upgrade
3: Service Quality Upgrade
Conclusion
Although TFLearn makes it easier to build deep learning models, the iterative nature of the development process still involves multiple teams with tightly coupled tasks. This article introduced a configuration-based dependency injection solution that removes the dependencies between the development stages and largely eliminates Python programming.
With the proposed solution, each team focuses solely on provisioning its configuration files. The sample project on GitHub shows how this eases collaboration and increases productivity through a clearly defined division of responsibilities.