Canvas Cloud AI

Model Serving


Definition

Making trained AI models available to applications, typically through an API or managed service, so they can return predictions on demand. Like opening a restaurant that serves dishes created from tested recipes.

Real-World Example

Model serving infrastructure hosts a language translation model that applications can call via API to translate text in real time.
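The pattern above can be sketched with only the Python standard library: a tiny HTTP server exposes a `/predict` route, and a client sends JSON and gets a translation back. The lookup-table "model" and the route name are illustrative stand-ins, not any provider's API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request as urlrequest

# Toy "model": a lookup table standing in for a real trained translator.
TOY_MODEL = {"hello": "hola", "world": "mundo"}

def predict(text: str) -> str:
    # Translate word by word, leaving unknown words unchanged.
    return " ".join(TOY_MODEL.get(w, w) for w in text.lower().split())

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"text": "hello world"}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"translation": predict(payload["text"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

def serve(port: int = 0) -> HTTPServer:
    # Port 0 asks the OS for any free port; the real port is in server_address.
    server = HTTPServer(("127.0.0.1", port), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

if __name__ == "__main__":
    server = serve()
    url = f"http://127.0.0.1:{server.server_address[1]}/predict"
    req = urlrequest.Request(
        url,
        data=json.dumps({"text": "hello world"}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlrequest.urlopen(req) as resp:
        print(json.loads(resp.read())["translation"])  # hola mundo
    server.shutdown()
```

Managed serving platforms add what this sketch leaves out: autoscaling, authentication, versioned deployments, and monitoring.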

Cloud Provider Equivalencies

All providers offer managed endpoints that host models and expose HTTPS APIs for inference. ML platforms (SageMaker/Azure ML/Vertex AI/OCI Data Science) focus on deploying your trained models with scaling, monitoring, and versioning. Foundation-model services (Bedrock/Azure OpenAI/OCI Generative AI) provide hosted models you call via API without managing the underlying model servers.

AWS
Amazon SageMaker (Real-Time Inference, Serverless Inference, Batch Transform) and Amazon Bedrock (model invocation APIs for foundation models)
AZ
Azure Machine Learning (Online Endpoints, Managed Online Endpoints) and Azure AI Studio/Azure OpenAI (model deployment/inference endpoints for foundation models)
GCP
Vertex AI (Online Prediction Endpoints, Batch Prediction) and Google Cloud Run/GKE for custom serving
OCI
OCI Data Science (Model Deployment) and OCI Generative AI (inference APIs for foundation models)
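Across all four providers, the wire-level shape of an inference call is similar: an HTTPS POST with a JSON body and an auth header. A minimal sketch of that common shape, with a placeholder endpoint URL and bearer-token scheme (each provider's SDK wraps this pattern in its own auth and parameter names):

```python
import json
from urllib import request

def build_inference_request(endpoint_url: str, token: str, payload: dict) -> request.Request:
    """Build an HTTPS inference request in the shape most managed
    endpoints expect: a JSON body plus an auth header. The bearer-token
    scheme here is a generic placeholder, not a specific provider's."""
    return request.Request(
        endpoint_url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

# Hypothetical endpoint URL and token for illustration only.
req = build_inference_request(
    "https://example.invalid/v1/models/translator:predict",
    "YOUR_API_TOKEN",
    {"text": "hello world"},
)
# Sending it would be: with request.urlopen(req) as resp: json.loads(resp.read())
print(req.get_method())  # POST
```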
