From Code to Production — How Machine Learning Models Are Developed, Deployed, and Monitored

Ajay
3 min read · Oct 2, 2024


Listen to this here: https://open.spotify.com/show/5lSPSDsjZDFB5ov5qMRq75

Ever wondered what happens after a machine learning model is coded? For many engineers, the lifecycle of a model — how it’s developed, deployed, and monitored — can be a bit of a mystery. This article breaks down the practical steps involved in taking a model from concept to production in real-world applications.

This isn’t about large language models or the latest AI trends — it’s about the core engineering work behind any ML model. You’ll get a clear, actionable guide to how models are built, tested, and deployed, and learn about the ongoing processes that keep them performing well over time.

Whether you’re looking to expand your skill set or understand the full picture of the ML lifecycle, this guide will show you what it takes to turn machine learning models into reliable, scalable systems in production. Let’s get started!

Data Science Model Development Lifecycle (DSMDL)

The Data Science Model Development Lifecycle (DSMDL) involves various roles collaborating to build, deploy, and maintain a machine learning model. Each role has distinct responsibilities across different phases of the lifecycle.

Key Phases and Roles

[Figure: Different phases of the Data Science Model Development Lifecycle]

Data Analysts

  • Identify areas where data science models can benefit the organization (e.g., detecting account takeovers (ATOs) or promotion abuse).
  • Collaborate with Data Scientists to provide relevant data sources for feature creation.
  • Assist in labeling model outputs during the development phase and when the model is in production to ensure consistent accuracy.

Data Scientists

  • Interpret problem statements from Data Analysts and stakeholders.
  • Explore data sources to create new features for experimentation.
  • Conduct experiments with different models and features, often backtesting date-sensitive models (see the walk-forward sketch after this list).
  • After backtesting, settle on a set of final models and serve them in batch mode, with or without help from ML Engineers.
  • Once batch serving shows promising results, work with ML Engineers to deploy the models online.
  • Once online serving is live, monitor the models with help from ML Engineers.
  • To make sure a model doesn’t drift from its training performance, enlist Data Analysts to label a sample of its outputs (see the drift-check sketch after this list).
  • Even after a model goes live in production, keep monitoring and retraining it, since the data keeps changing with user behaviour and new features introduced in the application.
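To make the backtesting step concrete, here is a minimal walk-forward sketch using scikit-learn’s TimeSeriesSplit. The file name, columns, and fraud label are illustrative assumptions, not details from the article: each fold trains on past data and evaluates on the next time window, mimicking how the model would have performed historically.

```python
# Walk-forward backtest sketch; the data file and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import TimeSeriesSplit

# Assume one row per transaction, sorted by event time, with a binary fraud label.
df = pd.read_parquet("transactions.parquet").sort_values("event_time")
X = df.drop(columns=["event_time", "is_fraud"])
y = df["is_fraud"]

# Each fold trains on the past and scores the next time window.
for fold, (train_idx, test_idx) in enumerate(TimeSeriesSplit(n_splits=5).split(X)):
    model = GradientBoostingClassifier().fit(X.iloc[train_idx], y.iloc[train_idx])
    auc = roc_auc_score(y.iloc[test_idx], model.predict_proba(X.iloc[test_idx])[:, 1])
    print(f"fold {fold}: AUC = {auc:.3f}")
```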
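The labeled samples above let the team track live precision and recall. A complementary, label-free drift signal is the Population Stability Index (PSI), which compares a feature’s (or the model score’s) distribution at training time against a production sample. A minimal sketch, assuming a continuous feature; the 0.2 threshold is a common rule of thumb, not a universal standard:

```python
# PSI drift-check sketch for a continuous feature or model score.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between the training-time distribution
    (`expected`) and a production sample (`actual`)."""
    cuts = np.percentile(expected, np.linspace(0, 100, bins + 1))
    cuts[0], cuts[-1] = -np.inf, np.inf                    # cover out-of-range values
    e = np.histogram(expected, cuts)[0] / len(expected)
    a = np.histogram(actual, cuts)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

# Rule of thumb: PSI above ~0.2 suggests drift worth investigating or retraining for.
```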

ML Engineers

  • Focus on model deployment and monitoring.
  • Understand the deployment/serving requirements for data science models from Data Scientists.
  • Identify data sources and build ETL/feature-engineering pipelines on top of them.
  • Deploy models, expose APIs for model consumption (see the serving sketch after this list), and handle compliance aspects where necessary.
  • Work with Data Scientists to ensure models are properly integrated into production environments.
  • Apply sanctions (e.g., blocking or restricting flagged accounts) based on model results if needed.
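As a sketch of the “expose APIs” step, here is a minimal online-scoring endpoint built with FastAPI around a scikit-learn model. The model path, feature names, and route are hypothetical; in practice the service would usually pull precomputed features from a feature store rather than expect the caller to send them all:

```python
# Minimal model-serving sketch; model path, features, and route are hypothetical.
import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # artifact handed over by the Data Scientists

class Features(BaseModel):
    account_age_days: float
    login_attempts_24h: int
    promo_redemptions_7d: int

@app.post("/score")
def score(features: Features) -> dict:
    row = pd.DataFrame([features.model_dump()])
    # Return a risk probability the caller can threshold on.
    return {"score": float(model.predict_proba(row)[0, 1])}
```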

MLOps/ML Platform Engineers

  • Develop tools and infrastructure for scalable, efficient model development and deployment. Key responsibilities include:
  • Providing auto-scaling infrastructure (e.g., Kubeflow Notebook Servers, Kubeflow Pipelines).
  • Managing workflow orchestration for feature-engineering, training, and model pipelines.
  • Offering tools for experiment tracking (e.g., MLflow; see the sketch after this list) and feature engineering (e.g., Kafka, Feast, Firehose, Dagger, Optimus).
  • Facilitating labeling and monitoring tools (e.g., CVAT for labeling, Grafana/Kibana for performance tracking).
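To make experiment tracking concrete, here is a minimal MLflow sketch; the experiment name, run name, and hyperparameters are illustrative, and the synthetic dataset stands in for real features:

```python
# Experiment-tracking sketch with MLflow; names and parameters are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, random_state=42)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=42)

mlflow.set_experiment("ato-detection")
with mlflow.start_run(run_name="gbt-baseline"):
    model = GradientBoostingClassifier(n_estimators=200, max_depth=4).fit(X_tr, y_tr)
    mlflow.log_params({"n_estimators": 200, "max_depth": 4})
    mlflow.log_metric("val_auc", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
    mlflow.sklearn.log_model(model, "model")  # versioned artifact for later serving
```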
