
Rate Limiter System


A rate limiter protects APIs from abuse and ensures fair resource allocation across clients. This Azure-native design implements multiple algorithms — token bucket for smooth rate limiting, sliding window log for precise counting, and fixed window counter for simplicity. The distributed implementation uses Azure Redis Cache for atomic counter operations across multiple API Management instances, with per-client and per-endpoint configurable limits.
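The token bucket algorithm mentioned above can be sketched as follows. This is an illustrative, in-process version (class and parameter names are my own, not part of the Azure design): a bucket holds up to `capacity` tokens, refills at `rate` tokens per second, and each request consumes one token, so short bursts are absorbed while the long-run rate stays bounded.

```python
import time

class TokenBucket:
    """Illustrative token bucket rate limiter (single-process sketch)."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity        # maximum burst size, in tokens
        self.rate = rate                # refill rate, tokens per second
        self.tokens = float(capacity)   # bucket starts full
        self.updated = time.monotonic() # time of last refill

    def allow(self) -> bool:
        now = time.monotonic()
        # Credit tokens accrued since the last check, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False          # bucket empty: reject or throttle
```

A bucket with `capacity=100` and `rate=10` admits a burst of 100 requests, then sustains 10 requests per second, which is why the token bucket is described as "smooth" rate limiting.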

Data Flow

API Management → Rate Limit Checker → Counter Store (Redis)
Rate Limit Rules (limit configuration read by the checker)
Limit Metrics (throttling telemetry emitted by the checker)


Service Breakdown (5 services)

API Management
  • Exposes backend services through managed API endpoints
  • Enforces authentication, throttling, and quotas
  • Provides developer portal and API analytics
Rate Limit Checker (Azure Functions)
  • Executes event-driven functions without managing servers
  • Scales based on event volume with consumption billing
  • Supports durable functions for stateful workflows
Counter Store (Redis)
  • Caches frequently accessed data in-memory
  • Reduces database round-trips and latency
  • Supports TTL-based expiration policies
Rate Limit Rules (Cosmos DB)
  • Provides globally distributed multi-model database
  • Guarantees single-digit ms reads worldwide
  • Supports five consistency levels
Limit Metrics (Azure Monitor)
  • Tracks API call rates and quota consumption
  • Emits alerts when rate limits are approached
  • Provides dashboards for throttling visibility

Scaling Strategy

Azure Redis Cache provides atomic increment operations for counter-based rate limiting shared across all API Management instances. Rate limit rules are stored in Cosmos DB and cached locally with short TTLs, so stale limits expire quickly after a rule change. The token bucket check runs in Azure Functions, invoked from an API Management policy, and adds negligible latency for requests within their limits. Azure Monitor tracks rate-limit hits for alerting and capacity planning.
