
A comprehensive guide to MLOps with Intelligent Products Essentials


Intelligent Products Essentials helps you rapidly build products that deploy AI models at the edge and send telemetry back to the cloud, where that data and the resulting learnings are used to generate insights and further optimize the AI models. An MLOps pipeline is essential to streamline each stage of this workflow, taking into account elements such as ML model training workflows, geo-distributed deployment, and the sources of data streams. Such a pipeline helps you rapidly build machine learning models that support AI at the edge in appliances and products.

This document describes the ML workflows, architectures, and MLOps pipelines for Intelligent Products Essentials using Vertex AI. The information in this document helps you do the following:

  • Identify approaches to train and deploy AI models based on your use cases, latency requirements, and data types.
  • Build MLOps and CI/CD pipelines that automate the end-to-end workflow, from code commit to model deployment and monitoring.
  • Use Google Distributed Cloud Edge to simplify AI model lifecycle management at the edge for real-time inference and prediction.
  • Build a scalable AI/ML deployment pipeline for real-time and batch inference on high-volume telemetry data to generate insights.

MLOps

As described in Hidden technical debt in ML systems, the ML code is only a small part of mature ML systems. In addition to the ML code and high-quality data, you need a way to put your ML processes into operation.

MLOps is an application of DevOps principles to ML systems: an engineering culture and practice that unifies ML system development (Dev) and ML system operation (Ops). The objective of MLOps is to provide a set of standardized processes and technology capabilities so that organizations can build, deploy, and put ML systems into operation rapidly, repeatably, and reliably.

Many roles within an organization play a part in the MLOps life cycle, from data scientists who work on building and validating models, to ML engineers who are responsible for keeping models working reliably for end users in production systems, to software engineers who write scalable distributed systems.

Diagram: the relationship between data engineering, software engineering, and data science, and their overlaps in MLOps.

 The previous diagram shows the following components:

  • Data engineering includes data pipelining.
  • Software engineering includes scalable system development.
  • Data science includes model prototyping.

The following overlaps occur:

  • Data pipelining and scalable system development share CI/CD.
  • Scalable system development and model prototyping share model deployment.
  • Model prototyping and data pipelining share an MVP (minimum viable product).
  • All three share MLOps.

The following sections discuss how MLOps can be implemented with Intelligent Products Essentials and Vertex AI.

MLOps personas

Diagram: Intelligent Products Essentials and the core MLOps user personas.

The preceding diagram shows the following component and core MLOps user personas:

  • Intelligent Products Essentials: stores customer data, device data, device telemetry, and ownership data across BigQuery and Cloud Storage.
  • Data scientists: responsible for analyzing data stored in Intelligent Products Essentials, feature engineering, model development, model evaluation, and building an ML pipeline.
  • ML engineers: responsible for orchestrating and hosting model deployment at scale.
  • Automation team: responsible for building automated pipelines for model training and deployment in staging, test, and production environments.

The following sections describe the machine learning workflow architecture options for Intelligent Products Essentials across different use cases.

ML workflow for Intelligent Products Essentials

The following diagram illustrates the training and deployment scenarios for inference that are possible in an Intelligent Products Essentials ML workflow.

Diagram: training and deployment options in the Intelligent Products Essentials ML workflow.

These scenarios use the information gathered by Intelligent Products Essentials, such as device data, device telemetry, and customer ownership data, which is stored in a data warehouse:

  • ML Training
    • No code, low code model training workflow
      • BigQuery ML & Analytics
      • AutoML - Text, image, video and tabular data
    • Custom model training workflow
      • Vertex AI Training & Pipelines
  • ML Deployment
    • Edge deployment
    • Vertex AI Endpoint

The following sections explore each of these options in more detail.

ML training

No code, low code model training workflow
  • BigQuery ML & Analytics:

    BigQuery ML is a model development service within BigQuery. With BigQuery ML, SQL users can train ML models directly in BigQuery without needing to move data or worry about the underlying training infrastructure. To learn more about the advantages of using BigQuery ML, see What is BigQuery ML?

    To create a model in BigQuery, use the BigQuery ML CREATE MODEL statement. This statement is similar to the CREATE TABLE DDL statement. When you run a query that contains a CREATE MODEL statement, a query job is generated for you that processes the query.

    For example, the CREATE MODEL syntax for an XGBoost model is as follows:

    {CREATE MODEL | CREATE MODEL IF NOT EXISTS | CREATE OR REPLACE MODEL}
    model_name
    [INPUT(field_name field_type, …)
     OUTPUT(field_name field_type, …)]
    OPTIONS(MODEL_TYPE = 'XGBOOST', MODEL_PATH = string_value);
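
    As a concrete, hypothetical sketch (the project, dataset, table, and column names are placeholders), the following Python snippet uses the BigQuery client library to run a CREATE MODEL statement that trains a boosted-tree (XGBoost-based) classifier on device telemetry:

        from google.cloud import bigquery

        client = bigquery.Client()

        # Train a boosted-tree classifier in BigQuery ML.
        # Dataset, table, and column names are placeholders for illustration only.
        query = """
        CREATE OR REPLACE MODEL `my_project.telemetry_ds.failure_model`
        OPTIONS (
          MODEL_TYPE = 'BOOSTED_TREE_CLASSIFIER',
          INPUT_LABEL_COLS = ['failed_within_30d']
        ) AS
        SELECT temperature, vibration, runtime_hours, failed_within_30d
        FROM `my_project.telemetry_ds.device_telemetry`
        """
        client.query(query).result()  # blocks until the training query completes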
    
    • AutoML - Text, image, video and tabular data:

      Create and train models with minimal technical knowledge and effort. To learn more about AutoML, see AutoML beginner's guide.

      Diagram: the sequential stages of building an ML model, from data preprocessing through deployment.

The preceding diagram shows the sequential stages involved in building ML models: data preprocessing, feature engineering, model selection, hyperparameter tuning, evaluation, and deployment. Even during the experimentation or evaluation phases of an ML application, this process can become complex and time consuming.

Diagram: the AutoML guided workflow, from dataset definition to one-click deployment.

AutoML provides a graphical, codeless interface that guides users through the complete machine learning lifecycle, with significant automation and guardrails at each phase. This no code, low code interface supports defining the data schema and target, analyzing input features in the feature statistics dashboard, and automatically training the model, including automated feature engineering, model selection, and hyperparameter tuning. It also lets you evaluate model behavior before deployment to production and then deploy the model with one click. As a result, AutoML can reduce the time required to create an ML model from months to weeks or even days.
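
AutoML is typically used through the Google Cloud console, but the same training can also be started programmatically. The following is a minimal sketch, with hypothetical project, dataset, and column names, that uses the Vertex AI SDK for Python to train an AutoML tabular classification model:

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1")

    # Create a tabular dataset from a BigQuery table (placeholder names).
    dataset = aiplatform.TabularDataset.create(
        display_name="device-telemetry",
        bq_source="bq://my_project.telemetry_ds.device_telemetry",
    )

    # Configure and run an AutoML tabular classification training job.
    job = aiplatform.AutoMLTabularTrainingJob(
        display_name="telemetry-failure-automl",
        optimization_prediction_type="classification",
    )
    model = job.run(
        dataset=dataset,
        target_column="failed_within_30d",
        budget_milli_node_hours=1000,
    )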

Custom model training workflow

The following diagram shows a Vertex AI custom ML pipeline, which lets you create and train models at scale using any ML framework. To learn more about custom training on Vertex AI, see Custom training overview.

Diagram: an example Vertex AI MLOps architecture with CI/CD.

This example Vertex AI MLOps architecture with CI/CD includes the following components:

  • Vertex AI Workbench and Colab Enterprise offer a Jupyter-based, fully managed, scalable, enterprise-ready compute infrastructure that connects to your organization's data in Google Cloud. Data scientists can use this infrastructure as their development environment.
  • Vertex AI Feature Store provides a centralized repository for organizing, storing, and serving ML features. Data scientists can use Vertex AI Feature Store to store and share features across their organization. It also serves the latest feature values at low latency for online predictions and real-time inference.
  • Kubeflow Pipelines SDK lets data scientists build and deploy portable, scalable ML workflows based on Docker containers. After the data scientists produce an ML model, the ML engineers or automation team can package the training procedures into an ML pipeline by using the Kubeflow Pipelines SDK.
  • Vertex AI Pipelines provides an execution environment for ML pipelines built using the Kubeflow Pipelines SDK or TensorFlow Extended (a minimal pipeline sketch follows this list). For Intelligent Products Essentials, we recommend that you use the Kubeflow Pipelines SDK. When you use the Kubeflow Pipelines SDK, you can also use prebuilt components such as the Google Cloud Pipeline Components for simple and rapid deployment. For the full list of prebuilt components, see the Google Cloud Pipeline Components list.
  • Cloud Source Repositories are fully featured, private Git repositories hosted on Google Cloud. After data scientists define their continuous training ML pipeline, they can store the pipeline definition in a source repository, like Cloud Source Repositories. This approach triggers the continuous integration and continuous deployment (CI/CD) pipeline to run.
  • CI/CD pipeline: builds, tests, and packages the components of the ML pipeline.
  • Vertex AI Training jobs are optimized for ML model training, which provides faster performance than directly running your training application on a GKE cluster. You can also identify and debug performance bottlenecks in your training job by using Vertex AI TensorBoard Profiler. Vertex AI Experiments helps you track and analyze different model architectures, hyperparameters, and training environments to identify the best model for your use case.
  • Vertex AI Model Registry stores different versions of trained models and their associated metadata.
  • Vertex AI Prediction offers two methods for getting predictions:
    • Online predictions are synchronous requests made to a model endpoint.
    • Batch predictions are asynchronous requests.
  • Vertex AI Model Monitoring monitors models for training-serving skew and prediction drift and sends you alerts when the incoming prediction data skews too far from the training baseline. You can use the alerts and feature distributions to evaluate whether you need to retrain your model.
  • Vertex ML Metadata helps you record the metadata, parameters, and artifacts that are used when building an ML model or ML pipeline. It helps you analyze, debug, and audit the performance of the ML pipeline and the artifacts that it produces.
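
To make these components more concrete, the following is a minimal, hypothetical sketch of a Kubeflow Pipelines (KFP v2) pipeline that is compiled and then submitted to Vertex AI Pipelines; the project, bucket, and component logic are placeholders rather than a complete training pipeline:

    from kfp import compiler, dsl
    from google.cloud import aiplatform

    @dsl.component(base_image="python:3.10")
    def validate_data(min_rows: int) -> bool:
        # Placeholder step; a real pipeline would read from BigQuery or
        # Vertex AI Feature Store and run training and evaluation components.
        return min_rows > 0

    @dsl.pipeline(name="telemetry-training-pipeline")
    def pipeline(min_rows: int = 1000):
        validate_data(min_rows=min_rows)

    # Compile the pipeline definition to a JSON spec.
    compiler.Compiler().compile(pipeline_func=pipeline, package_path="pipeline.json")

    # Submit the compiled pipeline to Vertex AI Pipelines.
    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="telemetry-training-pipeline",
        template_path="pipeline.json",
        pipeline_root="gs://my-bucket/pipeline-root",
    )
    job.run()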

ML Deployment

For each of the model training workflows, in addition to deploying the model to a Vertex AI online or batch prediction endpoint, you can also export the model for edge or local deployment, depending on the model type. For instance, a BigQuery ML model can be exported and then deployed as an endpoint both locally and on Vertex AI; see Exporting BigQuery ML models for online prediction for more details.
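
For example, the following is a minimal sketch, with placeholder project, dataset, and bucket names, that exports a trained BigQuery ML model to Cloud Storage with the EXPORT MODEL statement so that its artifacts can be served locally or uploaded to Vertex AI:

    from google.cloud import bigquery

    client = bigquery.Client()

    # Export the trained model's artifacts to Cloud Storage (placeholder names).
    client.query(
        "EXPORT MODEL `my_project.telemetry_ds.failure_model` "
        "OPTIONS (URI = 'gs://my-bucket/exported_model/')"
    ).result()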

Edge or local deployment
  • Mobile, web, and low-compute embedded device deployment: Trained AutoML Edge models can be exported in various formats, such as TensorFlow.js, TF Lite, and Core ML, depending on the model type (see the export sketch after this list).
  • Local deployment and serving: deploy exported BigQuery ML or custom-trained model files generated by the training pipeline by using a supported model serving Docker container on a local system.
  • Edge server deployment: Google Distributed Cloud Edge supports running workloads in Kubernetes containers and on virtual machines, including GPU-based workloads, which run on NVIDIA Tesla T4 GPUs.
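
As a hypothetical sketch of the edge export path, the following uses the Vertex AI SDK for Python to export a trained AutoML Edge model from the Model Registry as a TF Lite artifact. The model resource name and bucket are placeholders, and the available export formats depend on the model type:

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1")

    # Placeholder model resource name; supported export formats vary by model.
    model = aiplatform.Model(
        "projects/my-project/locations/us-central1/models/1234567890"
    )
    model.export_model(
        export_format_id="tflite",
        artifact_destination="gs://my-bucket/edge-models/",
    )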
Vertex AI Endpoints
  • Vertex AI online and batch prediction endpoints: ML models can be deployed to Vertex AI endpoints for either online or batch prediction, depending on the latency requirements. To learn more about Vertex AI endpoints, see the Getting predictions beginner's guide.
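
The following is a minimal, hypothetical sketch of deploying a registered model to an online endpoint and requesting a prediction with the Vertex AI SDK for Python; the model resource name, machine type, and instance payload are placeholders:

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1")

    # Deploy a model from the Model Registry to a new online endpoint.
    model = aiplatform.Model(
        "projects/my-project/locations/us-central1/models/1234567890"
    )
    endpoint = model.deploy(machine_type="n1-standard-4")

    # Request an online prediction for a single telemetry record.
    response = endpoint.predict(
        instances=[{"temperature": 71.2, "vibration": 0.8, "runtime_hours": 1040}]
    )
    print(response.predictions)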

Intelligent Products Essentials with MLOps accelerates your product roadmap and helps you create engaging products with better customer experiences and feedback loops across a variety of use cases. Refer to the Intelligent Products Essentials use cases section for more details.

