Data Drift

Definition

Data drift occurs when the statistical properties of a model's input data change over time relative to the data it was trained on, which can degrade model performance.
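One common way to quantify such a change for a single numeric feature is the Population Stability Index (PSI), which compares binned frequencies in a baseline sample against a current sample. The sketch below is illustrative only: the helper name `population_stability_index`, the bin count, the smoothing constant `eps`, and the synthetic Gaussian data are all assumptions, not part of any particular monitoring product.

```python
import bisect
import math
import random


def population_stability_index(baseline, current, n_bins=10, eps=1e-6):
    """Hypothetical helper: PSI of one numeric feature, baseline vs. current sample."""
    # Bin edges come from baseline quantiles, so each bin holds
    # roughly the same share of the baseline data.
    ordered = sorted(baseline)
    edges = [ordered[int(len(ordered) * i / n_bins)] for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # bisect_right returns how many edges are <= v, i.e. the bin index.
            counts[bisect.bisect_right(edges, v)] += 1
        # eps avoids log(0) when a bin is empty (an assumed smoothing choice).
        return [max(c / len(values), eps) for c in counts]

    p = bin_fractions(baseline)
    q = bin_fractions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic data for illustration: one stable sample, one with a shifted mean.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(training, same)
psi_shifted = population_stability_index(training, shifted)
print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

A frequently cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but appropriate thresholds depend on the feature and the application.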

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between data drift and concept drift?
Data drift means the distribution of the input data (features) changes over time: for example, customers start browsing different categories. Concept drift means the relationship between inputs and the target changes: for example, the same browsing behavior no longer predicts purchases because a competitor changed prices or a new policy changed buying decisions. You can have data drift without concept drift, and vice versa.
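The distinction can be made concrete with a small simulation. The sketch below is a toy model under stated assumptions: `x` is a hypothetical "browsing score", `y` records whether a purchase followed, and the function names and purchase rates are invented for illustration. In the data-drift scenario the distribution of `x` moves while the purchase rate for a given score stays the same; in the concept-drift scenario the distribution of `x` is unchanged but the same score now leads to fewer purchases.

```python
import random


def simulate(n, feature_mean, buy_rate_if_high, seed):
    """Hypothetical generator: returns (browsing_score, purchased) pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(feature_mean, 1.0)
        # The input-to-target relationship: high scores buy at buy_rate_if_high,
        # low scores buy at a fixed 10% rate.
        p_buy = buy_rate_if_high if x > 0 else 0.1
        rows.append((x, 1 if rng.random() < p_buy else 0))
    return rows


def summarize(rows):
    """Return (mean of x, purchase rate among rows with x > 0)."""
    mean_x = sum(x for x, _ in rows) / len(rows)
    high = [y for x, y in rows if x > 0]
    return mean_x, sum(high) / len(high)


baseline = summarize(simulate(20000, 0.0, 0.8, seed=1))
# Data drift: inputs shift (mean 0 -> 1), relationship unchanged.
data_drift = summarize(simulate(20000, 1.0, 0.8, seed=2))
# Concept drift: inputs unchanged, relationship changes (0.8 -> 0.4).
concept_drift = summarize(simulate(20000, 0.0, 0.4, seed=3))

for name, (mean_x, rate) in [
    ("baseline", baseline),
    ("data drift", data_drift),
    ("concept drift", concept_drift),
]:
    print(f"{name:14s} mean(x)={mean_x:+.2f}  P(buy | x > 0)={rate:.2f}")
```

Monitoring input statistics alone would flag the first scenario but miss the second, which only shows up once outcomes (labels) are available.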
When should I monitor for data drift?
Monitor for data drift when your model runs in production and the real-world environment can change—common in retail, ads, fraud, finance, logistics, and any system influenced by seasonality, campaigns, product changes, or user behavior. It’s especially important when model errors are costly (fraud losses, compliance risk, customer churn) or when you can’t label outcomes quickly (making performance drops harder to detect directly).
How much does data drift monitoring cost?
Costs depend on (1) how much data you monitor (volume and frequency), (2) where you store baselines and logs, (3) how often you run drift calculations, and (4) alerting/visualization and any retraining you trigger. In managed services, you typically pay for underlying compute (monitoring jobs), storage (logs/metrics), and sometimes per-feature or per-model monitoring. The biggest cost driver is often the operational overhead and retraining pipeline runs, not the drift metric itself.

Category: ai-ml

Difficulty: advanced

Related Terms

See Also