Prediction
Definition
The output a machine learning model produces when it is given new input data, such as a class label, a probability score, or a numeric estimate. A prediction reflects the patterns the model learned during training.
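As a minimal sketch of the definition above, the following Python snippet treats a tiny logistic model as already trained and runs it on one new record. The feature names, weights, bias, and the 0.5 threshold are all hypothetical values chosen for illustration; a real model would learn its parameters from training data.

```python
import math

# Hypothetical parameters standing in for what a trained model has learned.
WEIGHTS = {"sessions_last_30d": -0.08, "support_tickets": 0.45}
BIAS = -0.5


def predict_proba(features: dict) -> float:
    """Run the model on one new record and return a churn probability."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def predict_label(features: dict, threshold: float = 0.5) -> str:
    """Convert the probability into a class label using a cutoff."""
    return "churn" if predict_proba(features) >= threshold else "stay"


# New data the model has not seen before (made-up values).
new_customer = {"sessions_last_30d": 2, "support_tickets": 4}

score = predict_proba(new_customer)   # prediction as a probability score
label = predict_label(new_customer)   # prediction as a class label
print(f"score={score:.3f}, label={label}")
```

Both `score` and `label` are predictions: the same model output expressed once as a probability and once as a class label.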
Use Cases
- Netflix: Personalized content recommendations (predicting what a member is likely to watch next). Netflix uses machine learning models trained on viewing history, search behavior, and content metadata to generate predictions that rank titles for each user. These predictions are served at scale to personalize the home page and recommendation rows. This improves content discovery and engagement by showing more relevant titles, which supports retention and viewing time.
- Amazon: Product recommendations and ranking (predicting items a shopper is likely to buy). Amazon applies ML prediction to user behavior signals (clicks, purchases, dwell time) and item attributes to rank and recommend products in features such as 'Customers who bought this also bought'. This increases conversion rates and average order value by surfacing more relevant products.
- Uber: Estimated time of arrival (ETA) and demand forecasting (predicting trip duration and rider/driver demand). Uber uses ML models that ingest real-time and historical data such as traffic patterns, time of day, location, and event signals to produce predictions used in routing, pricing, and marketplace balancing. The result is more accurate ETAs and better marketplace efficiency, which improves rider experience and driver utilization.
Provider Equivalents
- AWS: Amazon SageMaker (Real-time inference, Batch Transform)
- Azure: Azure Machine Learning (Online endpoints, Batch endpoints)
- GCP: Vertex AI (Online prediction, Batch prediction)
- OCI: OCI Data Science (Model Deployment, Batch inference via jobs)
Frequently Asked Questions
- What's the difference between prediction and inference in machine learning?
  - The terms are often used interchangeably, but they are not identical. Inference is the process of running a trained model on new data. A prediction is the output of that process (for example, a probability score, a class label, or a numeric estimate).
- When should I use online prediction versus batch prediction?
  - Use online (real-time) prediction when you need an immediate response during an app interaction, such as fraud checks at checkout or recommendations on a web page. Use batch prediction when latency is not critical and you want to score many records at once, such as generating daily churn-risk scores for all customers.
- How much does prediction cost in the cloud?
  - Cost depends on how you run predictions: (1) compute for hosting endpoints or running jobs (instance type, CPU or GPU, and hours used), (2) the number of requests or the volume of data scored, (3) memory and autoscaling settings, and (4) networking and storage for inputs and outputs. Real-time endpoints typically incur ongoing hosting costs even when idle, while batch prediction costs are usually tied to job runtime and the resources used.
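To make the idle-cost difference concrete, here is a rough back-of-the-envelope sketch in Python. The hourly rate, instance count, job duration, and run count are hypothetical figures, not real provider prices, and the sketch covers compute hours only. Request charges, storage, networking, and autoscaling are left out.

```python
def endpoint_monthly_cost(hourly_rate: float, instances: int, hours: float = 730.0) -> float:
    """Always-on endpoint: billed for every hosted hour, busy or idle."""
    return hourly_rate * instances * hours


def batch_monthly_cost(hourly_rate: float, instances: int, job_hours: float, runs: int) -> float:
    """Batch scoring: billed only while each job is running."""
    return hourly_rate * instances * job_hours * runs


# Hypothetical rate for illustration only (currency units per instance-hour).
RATE = 0.25

# Two instances hosting a real-time endpoint for a full month (about 730 hours).
online = endpoint_monthly_cost(RATE, instances=2)

# Two instances running a 30-minute scoring job once a day for 30 days.
batch = batch_monthly_cost(RATE, instances=2, job_hours=0.5, runs=30)

print(f"online endpoint: {online:.2f}, nightly batch: {batch:.2f}")
```

Under these assumptions the always-on endpoint costs far more than the nightly batch job, because it is billed for idle hours as well. Actual costs depend on the provider's pricing model and the workload.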
Category: ai-ml
Difficulty: basic
Related Terms
See Also