Experimenting and Supporting Machine Learning Lifecycle with MLflow

Posted by OodlesAI on March 3rd, 2020

Reproducibility, good administration, and subsequent analysis are the essential pillars of software testing and analysis. Machine learning solutions are beginning to augment these practices with algorithm-driven systems. In today's article, we discuss one such platform that monitors the deployment and underlying intricacies of machine learning models. With MLflow, developers gain end-to-end control over the machine learning lifecycle, including code tracking, configuration, reproducible runs, and more.

What is MLflow?

MLflow is a framework that supports your machine learning lifecycle. It has components to monitor your model during training and serving, the ability to store models and load them in production, and facilities for building pipelines.

Why do we need such a thing?

Machine learning requires us to explore a wide range of datasets, data-preparation steps, and algorithms to build a model that maximizes some target metric. In fact, data processing is the foremost advantage of artificial intelligence services over legacy analytics systems. Once you have built your model, you also need to deploy it to a production system, monitor its performance, and continuously retrain it on new data and compare the outcomes.

Challenges with Machine Learning Deployment

1: Keeping an Eye on Experiments: It is hard to keep track of the experiments you perform while tuning a machine learning model. It becomes tedious to tell which dataset, code version, and arguments are responsible for a particular outcome.

2: Difficulty Reproducing Code: Suppose you have done a great job tracking every dataset, code version, and argument (which is hard in itself). To reproduce a result, you still need to capture the whole environment. This becomes exceptionally difficult when another data scientist wants to use your code, or when you want to run the same code at scale on another platform, say the cloud.

3: No Standard Way to Package and Deploy ML Models: Each data science team has its own approach for each ML library it uses, and the link between a model and the code and parameters that produced it is often lost.

MLflow does a great job of resolving all of the above problems.

MLflow has three components that help manage the machine learning workflow.


  1. MLflow Tracking

    MLflow facilitates the execution of ML code and the visualization of outcomes by providing an API and UI for logging code versions, parameters, artifacts, and metrics.

This tracking works in any environment, logging outcomes to local files or to a tracking server. Teams can also use it to compare results across different users.


  2. MLflow Projects

    This provides a standard format for packaging reusable data science code. Each project is simply a directory of code or a Git repository, with a file that specifies its dependencies and how to run the code. For instance, a project can contain a conda.yaml file specifying a Python Conda environment.

When you use the MLflow Tracking API inside a project, MLflow automatically records the project version and parameters. You can easily run existing MLflow Projects from GitHub or your own Git repository, and chain them into multi-step workflows.
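For illustration, the file at the root of a project (named `MLproject`) that declares the dependencies and how to run the code might look like this; the project name, parameter, and script name are hypothetical:

```yaml
name: example_project            # hypothetical project name
conda_env: conda.yaml            # Conda environment file listing dependencies

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.1}
    command: "python train.py --alpha {alpha}"
```

Such a project can then be launched with `mlflow run .` (or `mlflow run <git-url>`), and MLflow recreates the declared environment before executing the command.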


  3. MLflow Models

MLflow Models offer a convention for packaging ML models in various "flavors", along with a variety of tools to assist you with deployment. Each model is saved as a directory of files with a descriptor file that lists the flavors the model can be used in.

For instance, a TensorFlow model can be loaded as a TensorFlow DAG, or as a Python function to apply to input data. MLflow provides tools to deploy many common model types to diverse platforms: for example, any model supporting the "Python function" flavor can be deployed to a Docker-based REST server, to cloud platforms such as Azure ML and AWS SageMaker, or as a user-defined function in Apache Spark for batch and streaming inference. If you output MLflow Models using the Tracking API, MLflow also automatically records which project and run they originated from.

All these MLflow features make it a favorable catalyst for supporting machine learning lifecycles.


