Model Monitoring
Definition
Model monitoring is the practice of continuously tracking an AI model's performance, data quality, and system health in production so that issues are detected early and the model stays reliable.
Use Cases
- Uber: Monitors the ML models behind marketplace predictions (e.g., ETA, demand/supply forecasting) to detect performance regressions and data drift in production. Its internal ML platform, Michelangelo, logs model inputs and outputs, tracks model performance over time, and triggers alerts when metrics degrade or data distributions shift; it also supports continuous evaluation and operational dashboards for deployed models. Outcome: earlier detection of model regressions and data issues, more reliable ML-driven product features, and less time spent diagnosing production incidents.
- Netflix: Monitors recommendation and personalization models so that ranking quality stays stable as user behavior and the content catalog change. Netflix combines extensive telemetry, experimentation (A/B testing), and model performance tracking to watch key business and model metrics (e.g., engagement proxies) and to detect shifts that can indicate drift or unintended changes in model behavior. Outcome: more stable personalization performance and faster identification of issues that could hurt the member experience.
- Airbnb: Monitors search ranking and fraud/risk models to catch changes in data patterns and model performance after launches or seasonal shifts. Airbnb has described an internal ML platform with logging, offline/online evaluation, and dashboards that track model health and performance, letting teams detect drift and validate model changes with experiments. Outcome: lower risk of silent model failures and greater confidence in model deployments through measurable, ongoing performance checks.
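All three use cases rely on detecting when production data drifts away from what the model saw earlier. The sketch below shows one common way to turn that idea into a number, the Population Stability Index (PSI), computed for a single numeric feature. It is a minimal illustration and does not describe how any of the companies above implement drift detection; the function names `psi` and `drift_alert`, the choice of 10 equal-width bins, and the 0.2 alert threshold (a widely quoted rule of thumb) are all assumptions made for this example.

```python
import math
from typing import Sequence, Tuple


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference and a current sample.

    Bins are equal-width over the range of the reference sample (an
    assumption of this sketch; quantile bins are another common choice).
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread; cannot build bins")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))
            counts[idx] += 1
        eps = 1e-6  # floor for empty bins so log() stays defined
        return [max(c / len(values), eps) for c in counts]

    ref_frac = fractions(reference)
    cur_frac = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


def drift_alert(reference: Sequence[float], current: Sequence[float],
                threshold: float = 0.2) -> Tuple[bool, float]:
    """Return (alert, score); alert is True when PSI exceeds the threshold."""
    score = psi(reference, current)
    return score > threshold, score


if __name__ == "__main__":
    training_values = [i / 100 for i in range(100)]       # hypothetical reference data
    production_values = [x + 0.5 for x in training_values]  # hypothetical shifted data
    alert, score = drift_alert(training_values, production_values)
    print(f"alert={alert} psi={score:.3f}")
```

In a real deployment the reference sample would typically come from training or a recent stable window, and the check would run per feature on a schedule, with the alert routed to whatever dashboarding or paging tool the team already uses.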
Provider Equivalents
- AWS: Amazon SageMaker Model Monitor
- Azure: Azure Machine Learning (Azure ML) model monitoring (integrates with Azure Monitor / Application Insights)
- GCP: Vertex AI Model Monitoring
- OCI: OCI Data Science - Model deployment monitoring (via OCI Monitoring/Logging and custom metrics)
Frequently Asked Questions
- What's the difference between Model Monitoring and Observability?
- Model monitoring focuses specifically on ML behavior in production—things like prediction accuracy, data drift, bias signals, and feature quality. Observability is broader and covers the whole system (services, infrastructure, logs, traces). In practice, model monitoring is usually built on top of observability tools, adding ML-specific metrics and checks.
- When should I use Model Monitoring?
- Use it whenever an ML model is making production decisions that matter (revenue, safety, compliance, customer experience). It’s especially important when data changes over time (seasonality, new user behavior, new products), when errors are not immediately visible (for example, because ground-truth labels arrive late), or when you need auditability for regulated use cases.
- How much does Model Monitoring cost?
- Cost depends on (1) how much data you log (features, predictions, labels), (2) how often you run monitoring jobs (real-time vs hourly/daily), (3) storage and retention, (4) compute for drift/quality calculations, and (5) alerting and dashboarding tools. Managed services (e.g., SageMaker Model Monitor or Vertex AI Model Monitoring) typically charge for monitoring compute and associated logging/storage; DIY approaches shift costs to your own compute, metrics, and log storage.
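The first and third cost drivers above (logged data volume and retention) can be estimated with simple arithmetic before committing to a design. The sketch below is a back-of-the-envelope calculator, not a pricing tool: the `LoggingProfile` fields, the function names, and the `price_per_gib_month` parameter are hypothetical, and the actual rate must come from your provider's current price list.

```python
from dataclasses import dataclass


@dataclass
class LoggingProfile:
    requests_per_day: int     # predictions served per day
    bytes_per_record: int     # logged features + prediction + metadata, per request
    retention_days: int       # how long logged records are kept
    sample_rate: float = 1.0  # fraction of requests that are logged


def stored_gib(profile: LoggingProfile) -> float:
    """Steady-state storage footprint in GiB once the retention window is full."""
    daily_bytes = (profile.requests_per_day
                   * profile.sample_rate
                   * profile.bytes_per_record)
    return daily_bytes * profile.retention_days / 2**30


def monthly_storage_cost(profile: LoggingProfile, price_per_gib_month: float) -> float:
    """Storage cost per month; the price is an input, not a real quoted rate."""
    return stored_gib(profile) * price_per_gib_month


if __name__ == "__main__":
    # Hypothetical workload: about one million requests/day, 1 KiB logged per request.
    profile = LoggingProfile(requests_per_day=2**20, bytes_per_record=1024,
                             retention_days=30)
    print(f"{stored_gib(profile):.1f} GiB retained")
```

Lowering `sample_rate` or `retention_days` scales the storage estimate linearly, which is usually the first lever to pull when logging cost is a concern. Compute for monitoring jobs and alerting tools are not covered by this sketch.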
Category: ai-ml
Difficulty: intermediate
Related Terms
See Also