Latency

Definition

Latency is the delay between sending a request and receiving a response, much like the time between asking a question and getting an answer.
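
As a rough illustration of this definition, the sketch below times a single request/response cycle. The names `measure_latency_ms` and `fake_request` are hypothetical, and `fake_request` only simulates a network call by sleeping for about 50 ms; in practice you would pass in a function that performs the real request.

```python
import time


def measure_latency_ms(operation):
    """Return the wall-clock time of one call to `operation`, in milliseconds."""
    start = time.perf_counter()   # timestamp just before "sending the request"
    operation()                   # returns once the "response" has arrived
    return (time.perf_counter() - start) * 1000.0


def fake_request():
    # Hypothetical stand-in for a network round trip: sleeps roughly 50 ms.
    time.sleep(0.05)


latency_ms = measure_latency_ms(fake_request)
print(f"latency: {latency_ms:.1f} ms")
```

A single measurement can be noisy, so it is common to repeat the call several times and look at the spread of results rather than one number.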

Use Cases

Frequently Asked Questions

What's the difference between latency and bandwidth?
Latency is the time it takes for data to travel from sender to receiver (often measured in milliseconds). Bandwidth is the maximum amount of data that can be transferred per second (measured in units such as Mbps or Gbps). Low latency makes interactions feel instant; high bandwidth moves large files faster. You can have high bandwidth and still feel “lag” if latency is high.
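
To make the distinction concrete, here is a simplified back-of-the-envelope model: total transfer time is approximately latency plus payload size divided by bandwidth. The function name `transfer_time_s` and the example link figures are assumptions chosen for illustration, and the model ignores real-world effects such as connection setup, congestion, and protocol overhead.

```python
def transfer_time_s(size_bytes, bandwidth_bps, latency_s):
    """Estimate transfer time in seconds: latency + size / bandwidth.

    Simplified model (an assumption): one latency delay plus the time
    needed to push the payload through the link at full bandwidth.
    """
    return latency_s + (size_bytes * 8) / bandwidth_bps


# Two hypothetical links:
#   A: high bandwidth, high latency (1 Gbps, 100 ms)
#   B: lower bandwidth, low latency (100 Mbps, 10 ms)
small = 1_000            # 1 KB payload, e.g. a small API response
large = 1_000_000_000    # 1 GB payload, e.g. a large file

print(transfer_time_s(small, 1e9, 0.100))  # link A, small: about 0.100 s
print(transfer_time_s(small, 1e8, 0.010))  # link B, small: about 0.010 s
print(transfer_time_s(large, 1e9, 0.100))  # link A, large: about 8.1 s
print(transfer_time_s(large, 1e8, 0.010))  # link B, large: about 80 s
```

Under these assumptions the small payload finishes sooner on the low-latency link even though it has a tenth of the bandwidth, while the large file finishes sooner on the high-bandwidth link.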
When should I optimize for low latency?
Optimize for low latency when users need fast, interactive responses: online gaming, video calls, live trading, real-time dashboards, voice assistants, remote desktops, and APIs that power user interfaces. If your workload is batch processing (nightly jobs, backups), latency usually matters less than throughput and cost.
How much does latency cost?
Latency itself isn’t billed, but reducing it can change costs. Common cost drivers include deploying to more regions (extra infrastructure), CDNs and edge services (request and data transfer fees), premium networking or dedicated links, and additional caching layers. Moving data closer to users can also increase data egress and cross-region traffic charges.

Category: networking

Difficulty: basic

See Also