Auto Scaling

Definition

Auto Scaling automatically adjusts the number of running servers to match demand, adding capacity when load rises and removing it when load falls. It works like a restaurant that opens more tables during busy hours and closes them when it's quiet.
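
As a rough illustration of "adjusting the number of servers based on demand", the sketch below applies a simple proportional rule: it sizes the fleet so the average per-server metric moves toward a target, then clamps the result to a minimum and maximum. The function name, parameters, and default limits are hypothetical and chosen for illustration; real services add cooldowns, tolerances, and health checks on top of a rule like this.

```python
import math


def desired_capacity(current_servers, current_metric, target_metric,
                     min_servers=1, max_servers=10):
    """Return how many servers to run so the per-server metric nears the target.

    current_metric and target_metric are in the same unit, for example
    average CPU utilization in percent. All names here are illustrative.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Proportional rule: if servers are 1.5x over target, run 1.5x as many.
    raw = math.ceil(current_servers * current_metric / target_metric)
    # Never go below the floor or above the ceiling, whatever the metric says.
    return max(min_servers, min(max_servers, raw))


if __name__ == "__main__":
    # Busy hour: 4 servers at 90% CPU against a 60% target.
    print(desired_capacity(4, 90, 60))
    # Quiet hour: 4 servers at 30% CPU against the same target.
    print(desired_capacity(4, 30, 60))
```

The minimum and maximum act like the restaurant's smallest and largest seating plan: demand moves the count between them, but never past either limit.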

Use Cases

Provider Equivalents

Frequently Asked Questions

What's the difference between Auto Scaling and load balancing?
Load balancing distributes incoming traffic across existing servers so no single server gets overwhelmed. Auto Scaling changes how many servers you have. They’re often used together: the load balancer spreads traffic, and Auto Scaling adds or removes servers as demand changes.
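
The distinction can be sketched in a few lines, assuming a toy round-robin balancer and a toy scaler. Both functions and the `server-N` naming are hypothetical and do not reflect any provider's API: `distribute` only spreads requests over the servers it is given, while `scale` only changes how many servers exist.

```python
from itertools import cycle


def distribute(requests, servers):
    """Load balancing: spread requests round-robin over existing servers."""
    assignment = {server: [] for server in servers}
    for request, server in zip(requests, cycle(servers)):
        assignment[server].append(request)
    return assignment


def scale(servers, wanted):
    """Auto Scaling: grow or shrink the pool to the wanted size."""
    pool = list(servers)
    while len(pool) < wanted:
        # Illustrative naming only; real instances get provider-assigned IDs.
        pool.append(f"server-{len(pool) + 1}")
    return pool[:wanted]


if __name__ == "__main__":
    pool = ["server-1", "server-2"]
    print(distribute(range(6), pool))   # same pool, traffic spread across it
    pool = scale(pool, 3)               # demand rose, so the pool grows
    print(distribute(range(6), pool))   # the balancer now uses three servers
```
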
When should I use Auto Scaling?
Use Auto Scaling when your workload changes over time (daily peaks, seasonal events, marketing campaigns), when you need high availability (replace unhealthy instances automatically), or when you want to reduce costs by not running peak capacity 24/7. It’s especially useful for web apps, APIs, and batch workers with variable queues.
How much does Auto Scaling cost?
In many cases, the scaling feature itself has no additional charge (for example, AWS EC2 Auto Scaling doesn’t add a separate fee), but you pay for the resources it launches: compute instances, attached storage, load balancers, and monitoring/metrics (such as detailed monitoring or custom metrics). Costs depend on instance type, how long extra capacity runs, scaling frequency, and any supporting services.
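
To make "how long extra capacity runs" concrete, the sketch below compares server-hours for a fleet held at peak size all day against one that scales up only for an eight-hour busy period. The hourly rate and the fleet sizes are invented for illustration and are not real prices; the calculation also ignores storage, load balancer, and monitoring charges.

```python
HOURLY_RATE = 0.10  # hypothetical price per server-hour, not a real quote


def server_hours(servers_by_hour):
    """Total server-hours, given the number of servers running in each hour."""
    return sum(servers_by_hour)


def compute_cost(servers_by_hour, hourly_rate=HOURLY_RATE):
    """Compute-only cost for the period; supporting services are excluded."""
    return server_hours(servers_by_hour) * hourly_rate


# Fixed fleet: 10 servers for all 24 hours.
fixed_day = [10] * 24
# Scaled fleet: 2 servers for 16 quiet hours, 10 servers for 8 busy hours.
scaled_day = [2] * 16 + [10] * 8

if __name__ == "__main__":
    print(server_hours(fixed_day), server_hours(scaled_day))
    print(compute_cost(fixed_day), compute_cost(scaled_day))
```

Under these assumed numbers the scaled fleet uses 112 server-hours instead of 240, so the compute bill falls in the same proportion.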

Category: cloud

Difficulty: intermediate

Related Terms

See Also