Processing large volumes of data through an AI model all at once rather than one item at a time. Like grading a whole stack of exams in one sitting rather than grading each exam as it is handed in.
An e-commerce company runs batch inference overnight to generate product recommendations for millions of users in a single job.
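A minimal sketch of the idea in Python, using a stand-in model (the `DummyModel` class, feature shapes, and batch size are illustrative assumptions, not tied to any particular framework):

```python
import numpy as np

class DummyModel:
    """Stand-in for a trained model; predict() scores a 2-D array of features."""
    def predict(self, X: np.ndarray) -> np.ndarray:
        return (X.sum(axis=1) > 10).astype(int)

model = DummyModel()
items = np.random.rand(1_000_000, 20)  # one feature row per user

# One-at-a-time ("online") scoring: one call per item, so per-call
# overhead dominates.
# slow = [model.predict(row.reshape(1, -1))[0] for row in items]

# Batch scoring: process the dataset in large chunks, amortizing
# model-loading and per-call overhead across many items.
batch_size = 10_000
predictions = np.concatenate([
    model.predict(items[i:i + batch_size])
    for i in range(0, len(items), batch_size)
])
```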
All four clouds support running offline (non-real-time) predictions over large datasets. AWS uses SageMaker Batch Transform jobs, Azure uses Azure ML batch endpoints, GCP uses Vertex AI Batch Prediction jobs, and OCI commonly runs batch scoring as scheduled Data Science Jobs that load a model and write predictions to object storage or a database.
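As one concrete instance, here is a hedged sketch of starting an AWS SageMaker Batch Transform job with boto3. The job name, model name, and S3 paths are placeholders, and it assumes a model named `recs-model` is already registered in SageMaker; the other clouds follow the same basic pattern of pointing a managed job at an input dataset and an output location.

```python
import boto3

sm = boto3.client("sagemaker")

# Launch an offline scoring job: SageMaker reads records from the input
# S3 prefix, runs them through the registered model, and writes the
# predictions to the output S3 path.
sm.create_transform_job(
    TransformJobName="nightly-recommendations",   # placeholder name
    ModelName="recs-model",                       # assumed pre-registered model
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/users/",  # input dataset to score
            }
        },
        "ContentType": "application/jsonlines",
        "SplitType": "Line",  # treat each line as a separate record
    },
    TransformOutput={"S3OutputPath": "s3://my-bucket/predictions/"},
    TransformResources={"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
)
```

Because the job is fully managed, the compute spins up for the run and shuts down afterward, which is why this pattern suits overnight workloads like the recommendation example above.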