Throughput

Definition

The amount of data or work a system can process or transmit in a given time period. It is like the number of cars that can pass through a tunnel per hour.

Use Cases

Frequently Asked Questions

What's the difference between throughput and latency?
Throughput is how much work gets done per unit of time (e.g., MB/s, requests/second, transactions/second). Latency is how long a single operation takes (e.g., milliseconds per request). A system can have high throughput and high latency at the same time: if it processes many requests in parallel, total output stays high even though each individual request takes a while.
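
The relationship between the two can be sketched with Little's law, which says that in a steady state throughput equals the number of requests in flight divided by the time each one takes. The snippet below is a minimal illustration, not a benchmark; the function name and the sample numbers are hypothetical.

```python
def throughput_rps(concurrency: int, latency_s: float) -> float:
    """Steady-state throughput from Little's law: in-flight requests / latency."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return concurrency / latency_s


# Hypothetical system A: one request at a time, each taking 10 ms.
serial = throughput_rps(concurrency=1, latency_s=0.010)

# Hypothetical system B: 200 requests in flight, each taking 500 ms.
parallel = throughput_rps(concurrency=200, latency_s=0.5)

print(f"serial:   {serial:.0f} requests/second at 10 ms latency")
print(f"parallel: {parallel:.0f} requests/second at 500 ms latency")
```

In this sketch system B has 50 times the latency of system A yet four times the throughput, which is the high-throughput, high-latency case described above.
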
When should I focus on throughput?
Focus on throughput when your workload must handle high volume: many users at once, large data transfers, batch processing, streaming, or high transaction rates. Typical signs are queue backlogs, saturated network links, storage bandwidth limits, or databases hitting read/write capacity. If users complain about slow individual requests, start by checking latency; if the system can’t keep up with total demand, prioritize throughput.
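
One rough way to tell whether a system "can't keep up" is to compare the rate at which work arrives with the rate at which it completes. The sketch below assumes constant rates, which real traffic rarely has, and uses hypothetical function names and numbers.

```python
def is_throughput_bound(arrival_rps: float, capacity_rps: float) -> bool:
    """True when work arrives faster than the system can complete it."""
    return arrival_rps > capacity_rps


def backlog_after(arrival_rps: float, capacity_rps: float,
                  seconds: float, initial: float = 0.0) -> float:
    """Queue length after `seconds`, assuming constant arrival and service rates."""
    growth = (arrival_rps - capacity_rps) * seconds
    # A queue cannot go negative: spare capacity only drains existing backlog.
    return max(0.0, initial + growth)


# Hypothetical service that completes 1000 requests/second.
overloaded = backlog_after(arrival_rps=1200, capacity_rps=1000, seconds=60)
healthy = backlog_after(arrival_rps=800, capacity_rps=1000, seconds=60)

print(is_throughput_bound(1200, 1000), overloaded)
print(is_throughput_bound(800, 1000), healthy)
```

A backlog that keeps growing under steady load points to a throughput problem; a flat or empty queue with slow responses points to latency instead.
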
How much does throughput cost?
Throughput is rarely billed as a line item of its own, but achieving higher throughput usually increases cost. Common cost drivers include: (1) larger or more instances to process more requests per second, (2) higher-tier storage or provisioned IOPS/throughput options, (3) more database capacity units or replicas, (4) load balancers and autoscaling capacity, and (5) network egress charges for high data transfer out of the cloud. Pricing depends on the specific service (compute, database, storage, networking) and whether capacity is on-demand, provisioned, or reserved.
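
As a back-of-the-envelope illustration of cost drivers (1) and (5), the sketch below sizes a fleet for a target request rate and adds an egress charge. Every price and capacity figure is a made-up placeholder, not a quote from any provider, and the 730-hour month is an assumed average.

```python
import math

HOURS_PER_MONTH = 730  # assumption: average month (8760 hours / 12)


def instances_needed(target_rps: float, per_instance_rps: float) -> int:
    """Smallest whole number of instances that covers the target request rate."""
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    return math.ceil(target_rps / per_instance_rps)


def monthly_cost(target_rps: float, per_instance_rps: float, hourly_price: float,
                 egress_gb: float, egress_price_per_gb: float) -> float:
    """Compute cost for the fleet plus data-transfer-out cost, per month."""
    compute = instances_needed(target_rps, per_instance_rps) * hourly_price * HOURS_PER_MONTH
    egress = egress_gb * egress_price_per_gb
    return compute + egress


# Hypothetical inputs: 5000 requests/second target, 800 requests/second per
# instance, 0.10 per instance-hour, 2000 GB egress at 0.09 per GB.
estimate = monthly_cost(5000, 800, 0.10, 2000, 0.09)
print(f"instances: {instances_needed(5000, 800)}, estimated monthly cost: {estimate:.2f}")
```

Because instances come in whole units, cost rises in steps: in this sketch, going from 5600 to 5601 requests/second would add an eighth instance.
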

Category: networking

Difficulty: intermediate

Related Terms

See Also