MLOps: Model management, deployment and monitoring with Azure Machine Learning

Learn how to use Azure Machine Learning to manage the lifecycle of your models. Azure Machine Learning uses a Machine Learning Operations (MLOps) approach, which improves the quality and consistency of your machine learning solutions.

Azure Machine Learning provides the following MLOps capabilities:

  • Create reproducible ML pipelines. Pipelines allow you to define repeatable and reusable steps for your data preparation, training, and scoring processes.
  • Register, package, and deploy models from anywhere and track associated metadata required to use the model.
  • Capture the governance data for the end-to-end ML lifecycle, including who published a model, why changes were made, and when models were deployed or used in production.
  • Notify and alert on events in the ML lifecycle such as experiment completion, model registration, model deployment, and data drift detection.
  • Monitor ML applications for operational and ML-related issues. Compare model inputs between training and inference, explore model-specific metrics, and provide monitoring and alerts on your ML infrastructure.
  • Automate the end-to-end ML lifecycle with Azure Machine Learning and Azure DevOps to frequently update models, test new models, and continuously roll out new ML models alongside your other applications and services.
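Azure Machine Learning's data drift detection is a managed capability, but the underlying idea is to compare the distribution of model inputs seen at training time with the inputs arriving at inference time. As a conceptual sketch only (not the Azure ML API), a single-feature drift check might look like:

```python
import statistics

def detect_drift(train_values, serving_values, threshold=2.0):
    """Flag drift when the serving mean moves more than `threshold`
    training standard deviations away from the training mean."""
    train_mean = statistics.mean(train_values)
    train_std = statistics.stdev(train_values)
    serving_mean = statistics.mean(serving_values)
    shift = abs(serving_mean - train_mean) / train_std
    return shift > threshold

# Training data centered near 10; the first serving batch has drifted to ~20.
train = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
assert detect_drift(train, [19.7, 20.3, 20.1]) is True
assert detect_drift(train, [10.05, 9.95, 10.0]) is False
```

In practice Azure ML monitors many features at once and surfaces the results as alerts; the point here is only that drift detection reduces to a statistical comparison between the training and serving distributions.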

Create reproducible ML pipelines

Use ML pipelines from Azure Machine Learning to stitch together all of the steps involved in your model training process.

An ML pipeline can contain steps from data preparation to feature extraction to hyperparameter tuning to model evaluation.

If you use the Designer to create your ML pipelines, you can at any time select the “…” at the top right of the Designer page and then select Clone. Cloning your pipeline lets you iterate on its design without losing your old versions.

Register, package, and deploy models from anywhere

Model registration allows you to store and version your models in the Azure cloud, in your workspace. The model registry makes it easy to organize and keep track of your trained models.

Registered models are identified by name and version. Each time you register a model with the same name as an existing one, the registry increments the version. Additional metadata tags can be provided during registration. These tags are then used when searching for a model. Azure Machine Learning supports any model that can be loaded using Python 3.5.2 or higher.
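The name-plus-version behavior can be illustrated with a toy in-memory registry. This is only a sketch of the semantics, not the Azure ML SDK (there you would call `Model.register` against a workspace):

```python
class ToyModelRegistry:
    """Illustrates Azure ML's versioning semantics: registering a model
    under an existing name creates a new version rather than overwriting."""

    def __init__(self):
        self._models = {}  # name -> list of version records

    def register(self, name, tags=None):
        versions = self._models.setdefault(name, [])
        version = len(versions) + 1  # same name -> version auto-increments
        versions.append({"version": version, "tags": tags or {}})
        return version

    def latest(self, name):
        return self._models[name][-1]

registry = ToyModelRegistry()
registry.register("churn-model", tags={"area": "marketing"})
registry.register("churn-model")  # same name, so this becomes version 2
assert registry.latest("churn-model")["version"] == 2
```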

You can’t delete a registered model that is being used in an active deployment.

Package and debug models

Before you deploy a model into production, it is packaged into a Docker image. In most cases, image creation happens automatically in the background during deployment, but you can also specify the image manually.

If you run into problems with the deployment, you can deploy on your local development environment for troubleshooting and debugging.

Validate and profile models

Azure Machine Learning can use profiling to determine the ideal CPU and memory settings to use when deploying your model. Model validation happens as part of this process, using the data that you supply for profiling.

Convert and optimize models

Converting your model to Open Neural Network Exchange (ONNX) may improve performance. On average, converting to ONNX can yield a 2x performance increase.

Use models

Trained machine learning models are deployed as web services in the cloud or locally. You can also deploy models to Azure IoT Edge devices. Deployments can use CPUs, GPUs, or field-programmable gate arrays (FPGAs) for inferencing. You can also use models from Power BI.

When deploying a model as a web service or to an IoT Edge device, you provide the following items:

  • The model(s) that are used to score data submitted to the service/device.
  • An entry script. This script accepts requests, uses the model(s) to score the data, and returns a response.
  • A conda environment file that describes the dependencies required by the model(s) and entry script.
  • Any additional assets such as text, data, etc. that are required by the model(s) and entry script.
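A minimal entry script follows an `init()`/`run()` pattern: `init()` loads the model once when the service starts, and `run()` is called for each scoring request. The sketch below uses a toy JSON "model" of linear coefficients so it can run anywhere; in a real deployment, `init()` would typically load the registered model from the directory that Azure ML exposes via the `AZUREML_MODEL_DIR` environment variable:

```python
import json
import os

model = None  # populated by init()

def init():
    """Load the model once at service start-up.
    Here the 'model' is just linear coefficients stored as JSON;
    MODEL_PATH stands in for the real model location."""
    global model
    model_path = os.environ.get("MODEL_PATH", "model.json")
    with open(model_path) as f:
        model = json.load(f)

def run(raw_data):
    """Score a JSON request of the form {"data": [[x1, x2], ...]}."""
    rows = json.loads(raw_data)["data"]
    weights, bias = model["weights"], model["bias"]
    scores = [sum(w * x for w, x in zip(weights, row)) + bias for row in rows]
    return json.dumps({"scores": scores})

# Local smoke test with a dummy model file.
with open("model.json", "w") as f:
    json.dump({"weights": [2.0, 1.0], "bias": 0.5}, f)
init()
print(run('{"data": [[1.0, 3.0]]}'))  # -> {"scores": [5.5]}
```

Because the script is plain Python, you can exercise `init()` and `run()` locally before deploying, which is also how the local debugging workflow described above works.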

You also provide the configuration of the target deployment platform. For example, the VM family type, available memory, and number of cores when deploying to Azure Kubernetes Service.
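The conda environment file listed above is plain YAML. A minimal, hypothetical example for a scikit-learn model might look like this (package choices are assumptions for illustration):

```yaml
# Hypothetical conda environment file for a scikit-learn model.
name: inference-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - scikit-learn
  - pip
  - pip:
      - azureml-defaults   # required for Azure ML web services
```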

When the image is created, components required by Azure Machine Learning are also added. For example, assets needed to run the web service and interact with IoT Edge.

Capture the governance data for the end-to-end ML lifecycle

Azure ML gives you the capability to track the end-to-end audit trail of all of your ML assets. Specifically:

  • Azure ML integrates with Git to track information on which repository / branch / commit your code came from.
  • Azure ML Datasets help you track, profile, and version data.
  • Azure ML Run history stores a snapshot of the code, data, and compute used to train a model.
  • The Azure ML Model Registry captures all of the metadata associated with your model: which experiment trained it, where it is being deployed, and whether its deployments are healthy.

Automate the ML lifecycle

You can use GitHub and Azure Pipelines to create a continuous integration process that trains a model. In a typical scenario, when a Data Scientist checks a change into the Git repo for a project, the Azure Pipeline will start a training run. The results of the run can then be inspected to see the performance characteristics of the trained model. You can also create a pipeline that deploys the model as a web service.
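As an illustration, a minimal Azure Pipelines definition that kicks off training on each commit to `main` could look like the following. The file layout, script name, and dependency handling are assumptions, not a prescribed setup:

```yaml
# azure-pipelines.yml -- hypothetical CI sketch; train.py and
# requirements.txt are placeholders for your own project.
trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.8'
  - script: pip install -r requirements.txt
    displayName: Install dependencies
  - script: python train.py
    displayName: Submit training run
```

In a fuller setup, `train.py` would submit a run to an Azure ML compute target, and a later stage would evaluate the resulting model and deploy it as a web service if it meets your quality bar.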
