Question 1

What's the difference between prediction and inference in machine learning?

Accepted Answer

The two terms are often used interchangeably, but strictly speaking they name different things. Inference is the process of running a trained model on new data. A prediction is the output of that process (for example, a probability score, a class label, or a numeric estimate).
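
To make the distinction concrete, here is a minimal Python sketch. The weights, the bias, and the `run_inference` function are hypothetical stand-ins for any trained model. Calling the function is the inference step, and the value it returns is the prediction.

```python
import math

# Hypothetical parameters of an already-trained logistic model (assumed values).
WEIGHTS = [0.8, -0.5]
BIAS = 0.1

def run_inference(features):
    """Inference: apply the trained model to one new input."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    probability = 1.0 / (1.0 + math.exp(-z))
    label = 1 if probability >= 0.5 else 0
    # The returned dictionary is the prediction: a score and a class label.
    return {"probability": probability, "label": label}

prediction = run_inference([2.0, 1.0])
```

Here `prediction` holds both a probability score and a class label, two of the output types mentioned above.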

Question 2

When should I use online prediction versus batch prediction?

Accepted Answer

Use online (real-time) prediction when you need an immediate response during an app interaction, such as fraud checks at checkout or recommendations on a web page. Use batch prediction when latency is not critical and you want to score many records at once, such as generating daily churn-risk scores for all customers.
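
The two modes can be sketched in Python as follows. The `score` function and the `failed_logins` and `customer_id` fields are hypothetical, chosen only for illustration. A real deployment would call a hosted model instead.

```python
def score(record):
    """Hypothetical stand-in for a trained model: returns a risk score in [0, 1]."""
    return min(1.0, record["failed_logins"] * 0.25)

def predict_online(record):
    # Online: one record in, one answer out, while the caller waits.
    return score(record)

def predict_batch(records):
    # Batch: many records scored in one job; results are kept for later use.
    return {r["customer_id"]: score(r) for r in records}

customers = [
    {"customer_id": "a1", "failed_logins": 0},
    {"customer_id": "b2", "failed_logins": 3},
]

single = predict_online(customers[1])   # e.g. a check during checkout
nightly = predict_batch(customers)      # e.g. a daily scoring job
```

Both paths use the same model. What differs is when the scoring happens and how many records are handled per call.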

Question 3

How much does prediction cost in the cloud?

Accepted Answer

Cost depends on how you run predictions: (1) compute for hosting or running jobs (instance/CPU/GPU type and hours), (2) number of requests or volume of data scored, (3) memory and autoscaling settings, and (4) networking and storage for inputs/outputs. Real-time endpoints typically incur ongoing hosting costs even when idle, while batch prediction costs are usually tied to job runtime and resources used.
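
As a rough illustration of the idle-cost difference, the sketch below compares compute cost only. The hourly rate, instance counts, and job durations are made-up numbers, not any provider's pricing, and the function names are hypothetical. Request fees, networking, and storage are left out.

```python
def endpoint_monthly_cost(hourly_rate, instances, hours_in_month=730):
    # Always-on endpoint: billed for every hour, whether busy or idle.
    return hourly_rate * instances * hours_in_month

def batch_monthly_cost(hourly_rate, instances, job_hours, jobs_per_month):
    # Batch jobs: billed only for the hours the jobs actually run.
    return hourly_rate * instances * job_hours * jobs_per_month

HOURLY_RATE = 0.50  # hypothetical price per instance-hour

online_cost = endpoint_monthly_cost(HOURLY_RATE, instances=2)
batch_cost = batch_monthly_cost(
    HOURLY_RATE, instances=2, job_hours=1.5, jobs_per_month=30
)
```

Under these assumed numbers the endpoint costs 730.0 per month and the daily batch job costs 45.0. The gap comes entirely from the hours billed, so actual figures depend on your provider's rates and your traffic.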
