Scalability
Definition
The ability of a system to add computing capacity when demand grows and release it when demand falls, often automatically, so that performance holds up while resource use and cost stay efficient.
Use Cases
- Netflix: Handles large, variable streaming demand (evenings, weekends, new releases) without service disruption. It runs on AWS and uses Auto Scaling groups to add and remove EC2 instances behind load balancers based on demand and health checks, and it designs services to tolerate instance replacement. Result: performance holds during traffic spikes without paying for peak capacity all the time.
- Amazon: Scales retail traffic for major events like Prime Day, which bring large, sudden increases in shoppers and API requests. It uses elastic, horizontally scalable architectures with automated scaling and load balancing across fleets of services to expand capacity during peaks and reduce it afterward. Result: very high request volumes are served during events, and cost efficiency improves because capacity scales down when demand drops.
- Spotify: Scales music streaming and personalization workloads as user activity changes by time of day and region. It uses cloud infrastructure with autoscaling compute for services and batch/stream processing, so capacity can grow during high usage and shrink during quieter periods. Result: the user experience stays responsive with less waste from overprovisioning.
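The "add and remove instances based on demand" step in the use cases above can be sketched as a simple proportional rule. This is an illustrative sketch only, not the algorithm of any particular provider; the function name `desired_instances` and its parameters (`metric`, `target`, `min_size`, `max_size`) are hypothetical names chosen for this example.

```python
import math


def desired_instances(current: int, metric: float, target: float,
                      min_size: int = 1, max_size: int = 10) -> int:
    """Return the instance count that should bring the average metric back to target.

    `metric` is the observed per-instance average (for example CPU percent) and
    `target` is the level the group should hover around. All names here are
    illustrative assumptions, not a real provider API.
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    # Proportional rule: if instances average 90% CPU against a 60% target,
    # the fleet needs 90 / 60 = 1.5 times as many instances.
    wanted = math.ceil(current * metric / target)
    # Clamp to the configured lower and upper bounds of the group.
    return max(min_size, min(max_size, wanted))
```

For example, `desired_instances(4, 90.0, 60.0)` returns 6 (scale out), while `desired_instances(4, 30.0, 60.0)` returns 2 (scale in). Real autoscalers typically add cooldown periods and health checks around a rule like this to avoid rapid oscillation.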
Provider Equivalents
- AWS: AWS Auto Scaling
- Azure: Azure Autoscale
- GCP: Compute Engine autoscaler (for managed instance groups)
- OCI: OCI Autoscaling
Frequently Asked Questions
- What's the difference between scalability and elasticity?
- Scalability is the ability of a system to handle growth by adding resources: more servers (horizontal scaling), bigger servers (vertical scaling), or both. Elasticity is a form of scalability in which capacity changes automatically and quickly to match demand, growing during spikes and shrinking afterward.
- When should I design for scalability?
- Design for scalability when your workload demand changes over time (daily peaks, seasonal traffic, marketing campaigns) or when you expect growth. It’s especially useful for web apps, APIs, e-commerce, streaming, and data processing jobs where you want to avoid outages during spikes and avoid paying for idle capacity during slow periods.
- How much does scalability cost?
- The autoscaling services themselves often carry little or no direct fee, but you pay for the resources they add (compute instances, containers, database capacity, load balancers, and network egress). Costs depend on how often you scale, how large the peak is, how long you stay at peak, which instance types you use, and whether you apply cost optimizations such as reserved capacity, savings plans/commitments, or spot/preemptible instances.
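To make the cost factors in the last answer concrete, the following sketch compares instance-hours for a fleet provisioned for peak all the time against a fleet that resizes every hour. It assumes instant scaling, hourly billing, and a fixed request capacity per instance; the function name `instance_hours` and the demand figures in the usage note are hypothetical.

```python
import math
from typing import Sequence, Tuple


def instance_hours(demand_per_hour: Sequence[float],
                   capacity_per_instance: float,
                   min_instances: int = 1) -> Tuple[int, int]:
    """Return (static_hours, elastic_hours) for a series of hourly demand values.

    static_hours:  instance-hours if the fleet is sized for the peak hour all the time.
    elastic_hours: instance-hours if the fleet is resized each hour to fit demand.
    Assumes instant scaling and hourly billing (illustrative simplifications).
    """
    if not demand_per_hour or capacity_per_instance <= 0:
        raise ValueError("need at least one hour of demand and a positive capacity")
    # Instances needed in each hour, never dropping below the configured minimum.
    needed = [max(min_instances, math.ceil(d / capacity_per_instance))
              for d in demand_per_hour]
    static_hours = max(needed) * len(needed)  # pay for peak capacity every hour
    elastic_hours = sum(needed)               # pay only for what each hour requires
    return static_hours, elastic_hours
```

With a made-up demand profile of `[100, 100, 400, 800, 400, 100]` requests per hour and 100 requests per instance, `instance_hours` returns `(48, 19)`: the static fleet pays for 48 instance-hours and the elastic fleet for 19. Multiplying either figure by an hourly instance price gives a rough compute cost; commitments, spot pricing, and egress would change the real bill.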
Category: cloud
Difficulty: basic
Related Terms
See Also