Making trained AI models available to applications through APIs or services that return predictions on demand. Like opening a restaurant that serves dishes created from tested recipes.
Model serving infrastructure hosts a language translation model that applications can call via an API to translate text in real time.
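A minimal sketch of what such a call might look like from the application side, assuming a hypothetical translation endpoint that accepts and returns JSON (the URL, auth scheme, and payload fields here are illustrative, not any specific provider's API):

```python
import requests

# Hypothetical serving endpoint; the real URL, auth, and payload shape
# depend entirely on how the model was deployed.
ENDPOINT = "https://models.example.com/translate"
API_KEY = "replace-with-real-key"

def translate(text: str, target_lang: str = "fr") -> str:
    """Send text to the hosted translation model and return the translation."""
    resp = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"text": text, "target_lang": target_lang},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["translation"]

print(translate("Hello, world!"))
```

The application never loads the model itself; it only sends a request and reads back a prediction, which is the core idea of model serving.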
All providers offer managed endpoints to deploy trained models behind HTTPS APIs with autoscaling, monitoring, and security. Managed ML platforms (SageMaker/Azure ML/Vertex AI/OCI Data Science) focus on deploying your own models, while foundation-model services (Bedrock/Azure OpenAI/OCI Generative AI) provide hosted models accessed via API.
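As a concrete sketch on a managed ML platform, invoking a model already deployed to a SageMaker endpoint looks roughly like this (the endpoint name and JSON payload schema are assumptions; the content type must match whatever the deployed model container expects):

```python
import json
import boto3

# SageMaker Runtime client; credentials and region come from standard AWS config.
runtime = boto3.client("sagemaker-runtime")

payload = {"text": "Hello, world!", "target_lang": "fr"}  # assumed input schema

response = runtime.invoke_endpoint(
    EndpointName="translation-endpoint",  # hypothetical endpoint name
    ContentType="application/json",       # must match the model container
    Body=json.dumps(payload),
)

result = json.loads(response["Body"].read())
print(result)
```

Azure ML, Vertex AI, and OCI Data Science expose equivalent invoke/predict calls against their own managed endpoints, while the foundation-model services skip the deployment step and let you call provider-hosted models directly.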