MLOps

Definition

Machine Learning Operations (MLOps): the practices and tools for deploying, monitoring, and managing machine learning models in production. It is similar to DevOps, but adapted to ML systems.

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between MLOps and DevOps?
DevOps focuses on reliably building, testing, and deploying software code. MLOps includes those practices but adds ML-specific needs: managing training data and features, tracking experiments, versioning models, monitoring model accuracy and drift, and retraining models when data changes.
When should I use MLOps?
Use MLOps when a model is running in production and business outcomes depend on it. Common triggers include multiple models or teams, frequent model updates, regulatory or audit requirements, the need for monitoring and alerting, and model performance that can degrade over time as data changes (drift). For a one-off prototype or a model used only in a notebook, full MLOps is usually unnecessary.
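
As a rough illustration of the drift trigger mentioned above, the sketch below computes a Population Stability Index (PSI) between a feature's training values and its live values. PSI is one common drift heuristic, not the only one. The function name, the bin count, and the 0.2 alert threshold are assumptions chosen for illustration (0.2 is a widely quoted rule of thumb, not a standard).

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """Compare two samples of one numeric feature; larger values mean more drift."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("expected sample must contain more than one distinct value")
    width = (hi - lo) / bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: live values shifted upward relative to training values.
training_values = [i / 100 for i in range(1000)]
live_values = [v + 3.0 for v in training_values]

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb threshold, tune per feature
psi = population_stability_index(training_values, live_values)
needs_review = psi > DRIFT_THRESHOLD
```

In a production setup, a check like this would typically run on a schedule and feed an alert or a retraining pipeline rather than being evaluated by hand.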
How much does MLOps cost?
Costs vary with compute (training and inference), storage (datasets, artifacts, logs), and operational tooling (pipelines, monitoring, CI/CD). The major cost drivers are how often you retrain, model size, traffic to inference endpoints, and how long you retain logs and metrics. Managed platforms (e.g., SageMaker, Azure ML, Vertex AI, OCI Data Science) generally charge for the underlying resources you use; self-managed MLOps can reduce platform fees but increase engineering and maintenance costs.
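
To make the cost drivers above concrete, here is a minimal back-of-the-envelope estimator. Every field name and rate in it is a hypothetical placeholder, not a real price from any provider, and it ignores tooling, data transfer, and engineering time. It only shows how retraining frequency, always-on inference capacity, and storage combine into a monthly figure.

```python
from dataclasses import dataclass
from typing import Dict

HOURS_PER_MONTH = 730  # common approximation for an always-on resource


@dataclass
class MonthlyUsage:
    retrains_per_month: int
    hours_per_retrain: float
    training_rate_per_hour: float   # hypothetical price of a training instance
    inference_instances: int
    inference_rate_per_hour: float  # hypothetical price of an inference instance
    storage_gb: float
    storage_rate_per_gb: float      # hypothetical price per GB-month


def estimate_monthly_cost(u: MonthlyUsage) -> Dict[str, float]:
    """Split a rough monthly bill into training, inference, and storage."""
    training = u.retrains_per_month * u.hours_per_retrain * u.training_rate_per_hour
    inference = u.inference_instances * HOURS_PER_MONTH * u.inference_rate_per_hour
    storage = u.storage_gb * u.storage_rate_per_gb
    return {
        "training": training,
        "inference": inference,
        "storage": storage,
        "total": training + inference + storage,
    }


# Illustrative numbers only: weekly retraining, two always-on endpoints.
example = MonthlyUsage(
    retrains_per_month=4,
    hours_per_retrain=2.0,
    training_rate_per_hour=3.0,
    inference_instances=2,
    inference_rate_per_hour=0.25,
    storage_gb=500.0,
    storage_rate_per_gb=0.02,
)
breakdown = estimate_monthly_cost(example)
```

With inputs like these, always-on inference tends to dominate the total, which is why endpoint traffic and instance count are usually the first things to review.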

Category: ai-ml

Difficulty: advanced

Related Terms

See Also