Bulkhead
Definition
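The Bulkhead pattern partitions a system's resources (threads, connections, service instances) into isolated pools, so that a failure or overload in one partition cannot exhaust the capacity that other partitions depend on. The name comes from the watertight compartments in a ship's hull, which keep a single breach from flooding the whole vessel.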
Use Cases
- Netflix: Preventing one failing dependency from cascading and taking down user-facing streaming experiences. Netflix popularized resilience patterns in microservices, notably through its Hystrix library, including isolating calls to each downstream dependency behind separate thread pools and concurrency limits so that a slow or failing dependency cannot consume all request-handling capacity. Result: a smaller blast radius for failures and better overall resilience during partial outages or dependency degradation.
- Amazon: Keeping checkout and payments available even when other site components degrade. Large e-commerce architectures commonly isolate critical paths (checkout, payment) from less critical subsystems (recommendations, browsing, catalog enrichment) using separate service tiers, independent scaling, and strict capacity limits to prevent shared resource exhaustion. Result: higher availability for revenue-critical transactions and fewer site-wide outages caused by failures in non-critical components.
- Uber: Maintaining core trip and dispatch functionality during partial service degradation. At large scale, services are typically segmented so that critical request paths have dedicated capacity and are protected by timeouts, concurrency limits, and independent scaling, which reduces the chance that one overloaded subsystem starves others of compute or connection pools. Result: more reliable core user flows and fewer cascading failures during traffic spikes or downstream incidents.
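The concurrency limits mentioned in the use cases above can be illustrated with a minimal Python sketch. It assumes a thread-based service and uses a semaphore per dependency. The names `Bulkhead`, `BulkheadFullError`, `call_with_bulkhead`, and the two example compartments are hypothetical and chosen only for illustration; this is not how any of the companies above implement the pattern.

```python
import threading
from contextlib import contextmanager


class BulkheadFullError(RuntimeError):
    """Raised when a compartment has no free slots."""


class Bulkhead:
    """Caps the number of concurrent calls allowed into one compartment."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self):
        # Fail fast instead of queueing: a blocked caller would still hold
        # a request thread, which is the exhaustion we want to avoid.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"bulkhead '{self.name}' is full")
        try:
            yield
        finally:
            self._slots.release()


# Hypothetical compartments: the names and limits are illustrative only.
recommendations = Bulkhead("recommendations", max_concurrent=2)
checkout = Bulkhead("checkout", max_concurrent=10)


def call_with_bulkhead(bulkhead, func, fallback):
    """Run func inside the bulkhead, or return fallback if it is full."""
    try:
        with bulkhead.slot():
            return func()
    except BulkheadFullError:
        return fallback
```

With this arrangement, a saturated `recommendations` compartment returns its fallback immediately, while calls through `checkout` still find free slots. Rejecting rather than waiting is one design choice among several; a bounded queue with a short timeout is a common alternative.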
Frequently Asked Questions
- What's the difference between the Bulkhead pattern and a Circuit Breaker?
- A bulkhead isolates resources so that one part of the system can’t use up all capacity (for example, separate thread pools or separate service deployments). A circuit breaker detects repeated failures or slowdowns in calls to a dependency and temporarily stops calling it to avoid wasting resources. Bulkheads limit the blast radius; circuit breakers stop repeated damage.
- When should I use the Bulkhead pattern?
- Use bulkheads when you have shared resources that could be exhausted (threads, database connections, CPU, memory, request quotas) and when some functions are more critical than others (payments vs. recommendations). It’s especially useful in microservices, multi-tenant systems, and any system with unpredictable traffic spikes or unreliable downstream dependencies.
- How much does the Bulkhead pattern cost?
- There’s no direct license cost for the pattern itself, but it can increase infrastructure spend because you may run separate service instances, separate node pools, separate databases, or reserve capacity for critical workloads. Costs depend on how you isolate (extra compute, extra load balancers, extra databases/replicas) and the amount of headroom you keep to ensure critical paths remain available.
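To make the contrast in the first answer above concrete, here is a minimal circuit breaker sketch in Python. Where a bulkhead caps how many calls may run at once, this class tracks consecutive failures and skips calls entirely for a cool-down period. The class name, the parameters `failure_threshold` and `reset_after`, and the injectable `clock` are assumptions made for illustration, not the API of any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stops calling a dependency after repeated consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self._clock = clock             # injectable so tests can fake time
        self._failures = 0
        self._opened_at = None          # None means the circuit is closed

    def call(self, func):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single failure now re-opens the circuit.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

The two patterns are complementary: a call can first pass through a circuit breaker like this one and then acquire a bulkhead slot, so that a dependency known to be failing does not even occupy capacity in its own compartment.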
Category: software
Difficulty: advanced