Processing large volumes of data through an AI model all at once rather than one item at a time. Like grading a stack of exams together rather than waiting for students to submit them individually.
An e-commerce company runs batch inference overnight to generate product recommendations for millions of users at once.
All four options run offline (non-real-time) predictions over large datasets. AWS, Azure, and GCP provide managed batch prediction features tied to model endpoints/artifacts, while OCI commonly implements batch inference by running a scheduled Data Science Job that loads a model and scores data in bulk.
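Regardless of the cloud provider, the core pattern is the same: load the model once, then score the whole dataset in chunks instead of handling one request at a time. The sketch below illustrates this with a hypothetical stub model and made-up user IDs; a real job would load a trained model artifact and read input from object storage.

```python
# Minimal batch-inference sketch (model and data are hypothetical stand-ins).

def load_model():
    # Stand-in for loading a trained model artifact from storage.
    # Returns a callable that scores a whole batch of users together.
    return lambda batch: [{"user_id": u, "score": len(u) % 5} for u in batch]

def batch_inference(user_ids, batch_size=1000):
    model = load_model()  # load once, reuse across every batch
    results = []
    for i in range(0, len(user_ids), batch_size):
        batch = user_ids[i:i + batch_size]  # slice the next chunk
        results.extend(model(batch))        # score the chunk in bulk
    return results

# Score 2,500 users in three batches (1000 + 1000 + 500).
recs = batch_inference([f"user_{n}" for n in range(2500)], batch_size=1000)
print(len(recs))  # one prediction per user
```

A scheduled job (e.g. an overnight cron or a managed batch-prediction run) would wrap exactly this loop, writing the results back to a database or file store for the application to read.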