The maximum amount of text (measured in tokens) that an AI model can process in a single conversation or request. Like the size of a desk, it determines how many documents you can spread out at once.
GPT-4 Turbo has a 128K token context window, allowing it to analyze an entire book or codebase in one conversation.
Context window is a model capability (the maximum number of tokens the model can read and attend to at once), not a standalone cloud service. All major clouds expose it through their managed LLM offerings (e.g., AWS via Amazon Bedrock models, Azure via Azure OpenAI models, GCP via Vertex AI models, OCI via Generative AI models), but the exact context size depends on the specific model you choose.
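Because context limits are counted in tokens rather than characters, it helps to estimate a document's token count before sending it. A minimal sketch, assuming the common rule of thumb of roughly 4 characters per token for English text (real tokenizers vary by model, so treat this as an approximation, not an exact count):

```python
# Rough check of whether a document fits in a model's context window.
# ASSUMPTION: ~4 characters per token is a heuristic for English text,
# not an exact tokenizer; production code should use the model's tokenizer.

CHARS_PER_TOKEN = 4  # assumed heuristic ratio

def estimate_tokens(text: str) -> int:
    """Approximate the token count of `text` from its character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(text: str, context_window: int,
                    reserved_for_output: int = 1024) -> bool:
    """True if the prompt likely fits, leaving room for the model's reply."""
    return estimate_tokens(text) + reserved_for_output <= context_window

doc = "word " * 100_000  # ~500,000 characters of filler text
print(estimate_tokens(doc))           # -> 125000 estimated tokens
print(fits_in_context(doc, 128_000))  # -> True (tight fit in a 128K window)
```

Reserving a slice of the window for the model's output matters in practice: the context window covers the prompt *and* the generated reply combined, so a prompt that exactly fills the window leaves no room for an answer.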