ML Systems in Production
Updated: Jul 16, 2020
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.
In general, machine learning algorithms begin with an initial hypothetical model, determine how well this model fits a set of data, and then work on improving the model iteratively. This training process continues until the algorithm can find no additional improvements, or until the user stops the process.
A typical machine learning project will include the following high-level steps that will transform a loose data hypothesis into a model that serves predictions.
Business needs and desired outcomes drive the purpose of this exercise, so defining them is the first step in the process. Once the objective is clear and well defined, the subsequent steps of data preparation, model building, and evaluation come into play. One key facet of the iterative process is redeploying machine learning models as the system learns new aspects and dimensions of the data. Two deployment modes are widely used in the industry: batch mode and real time.
Batch data processing
Batch data processing is an efficient way of processing high volumes of data: a group of transactions is collected over a period of time, then entered, processed, and the batch results are produced. Batch processing typically uses separate programs for input, processing, and output. Payroll and billing systems are classic examples.
Real time data processing
In contrast, real time data processing involves a continual input, process, and output of data. Data must be processed within a short time window (in or near real time). Radar systems, customer service applications, and bank ATMs are examples.
So the choice between deploying a model in batch mode versus real time mode is driven primarily by two questions: who is using the inferences, and how soon do they need them?
In order to understand the options available for deploying the models as Batch/ Real time prediction, one needs to have an understanding of Serialization.
Serialization converts the model, a Python object, into a byte stream. The idea is that this stream contains all the information necessary to reconstruct the object in another Python script.
There are two ways to serialize a model in Python: Pickle and Joblib.
Python has a built-in module called pickle which helps us save our model and load it later for scoring. The model can be saved using the pickle.dump() function to a file with any name or extension, and can be loaded back by pointing pickle.load() at that file.
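Here is a minimal sketch of the pickle dump and load round trip. The scikit-learn model and the file name model.pkl are illustrative choices, not requirements:

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a simple model (any fitted estimator would do)
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize the fitted model to disk with pickle.dump()
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Later (e.g. in a scoring script), reconstruct it with pickle.load()
with open("model.pkl", "rb") as f:
    loaded_model = pickle.load(f)

predictions = loaded_model.predict(X[:5])
```

The reloaded object behaves exactly like the original, so it can score new data without retraining.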
Joblib is also used for saving the model and works very similarly to pickle. It is available as a standalone library (older scikit-learn versions bundled it as sklearn.externals.joblib) and is particularly efficient for objects that carry large NumPy arrays.
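Saving and loading a model for prediction with joblib follows the same pattern; the model and file name below are illustrative:

```python
import joblib

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# joblib.dump() is efficient for estimators holding large numpy arrays
joblib.dump(model, "model.joblib")

# Load the model back for prediction
loaded_model = joblib.load("model.joblib")
predictions = loaded_model.predict(X[:5])
```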
We can store the pipeline steps in pickle or joblib objects so that the same preprocessing and modeling steps are applied to new data. Now, let's look at the different options available for deploying models for batch and real time prediction.
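For instance, a scikit-learn Pipeline bundles the preprocessing and the estimator into one serializable object, so new records pass through identical transforms at prediction time (the steps and file name here are illustrative):

```python
import joblib

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Bundle scaling and the classifier so they are serialized together
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)

joblib.dump(pipe, "pipeline.joblib")

# New data gets the same scaling before prediction, automatically
new_data = X[:3]  # stands in for fresh records arriving later
predictions = joblib.load("pipeline.joblib").predict(new_data)
```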
Batch Prediction:
a. Using Scheduler jobs
We can productionize a Python model in batch mode using Windows Task Scheduler or cron jobs. On AWS, batch prediction can be done by deploying the model on AWS Lambda.
These jobs pick up the scripts and execute them at the scheduled time. As part of deployment we load the pickle or joblib object, configure the input data source, and set the destination for the prediction output.
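A sketch of such a batch scoring script follows. The file names (model.joblib, input.csv, predictions.csv) and the cron schedule are assumed placeholders; the setup block at the top stands in for artifacts that would already exist in production:

```python
# batch_score.py - illustrative batch scoring job for cron / Task Scheduler
import joblib
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# --- one-time setup, standing in for pre-existing production artifacts ---
X, y = load_iris(return_X_y=True)
joblib.dump(LogisticRegression(max_iter=1000).fit(X, y), "model.joblib")
pd.DataFrame(X).to_csv("input.csv", index=False)

# --- the scheduled job itself ---
model = joblib.load("model.joblib")            # load the serialized model
batch = pd.read_csv("input.csv")               # configured input data source
batch["prediction"] = model.predict(batch.values)  # score the whole batch
batch.to_csv("predictions.csv", index=False)   # destination for the output

# Hypothetical crontab entry to run the script nightly at 02:00:
# 0 2 * * * /usr/bin/python3 /opt/jobs/batch_score.py
```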
Real Time Prediction:
a. Exposing the model as API:
We can expose the model as an API using the Flask micro framework, which lets us serve business logic over HTTP. As discussed earlier in the article, we need to serialize the model (using pickle or joblib) before using Flask to expose it as an API.
The exposed API can be called either with a single input or with a file input (containing multiple records) to get predictions from the deployed model.
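A minimal sketch of such a service is shown below. The /predict route name and the JSON payload shape ({"features": [[...], ...]}) are illustrative assumptions, and the in-script training stands in for a model serialized earlier:

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Stand-in for a model that was pickled earlier in the workflow
X, y = load_iris(return_X_y=True)
with open("model.pkl", "wb") as f:
    pickle.dump(LogisticRegression(max_iter=1000).fit(X, y), f)

app = Flask(__name__)

with open("model.pkl", "rb") as f:   # load the serialized model once at startup
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()              # expects {"features": [[...], ...]}
    preds = model.predict(payload["features"])
    return jsonify({"predictions": preds.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A client would then POST one record or a batch of records to /predict and receive the predictions as JSON.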
b. Docker Containerization
Dockerization is another way of deploying ML models. Packaging the application as a Docker image lets you deploy it on your own cloud/server or in a client environment without OS issues, environment mismatches, or version conflicts. Docker helps us with reproducibility, portability, and ease of deployment. A Docker container has its own storage and network namespaces and typically runs on a minimal Linux base image.
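As a hedged sketch, a Dockerfile for a Flask scoring service might look like the following. The file names (app.py, model.pkl, requirements.txt), base image, and port are all assumptions for illustration:

```dockerfile
# Illustrative Dockerfile for a hypothetical Flask scoring service
FROM python:3.8-slim

WORKDIR /app

# Install dependencies first so the layer is cached between builds
COPY requirements.txt .
RUN pip install -r requirements.txt   # e.g. flask, scikit-learn, joblib

# Copy the API script and the serialized model into the image
COPY app.py model.pkl ./

EXPOSE 5000
CMD ["python", "app.py"]
```

The image would then be built and run with something like `docker build -t ml-api .` followed by `docker run -p 5000:5000 ml-api`.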
In the next article we will delve deeper into real time prediction using an API, with a step-by-step example leveraging a Random Forest model to show it in action.
Please let me know your suggestions and questions in the comments section. Stay tuned!
Schedule a call with me to learn more!