Business Continuity
Definition
A strategy for keeping critical business operations running during and after a disaster or other disruption, so that the organization stays stable and can continue serving its customers.
Use Cases
- Netflix: Maintain streaming availability despite infrastructure failures. Approach: designed for resilience using microservices and redundancy across AWS Availability Zones, with automated failover and continuous testing of failure scenarios (commonly called chaos engineering). Outcome: improved ability to keep serving customers during component failures, and reduced outage impact through proactive resilience testing.
- Zoom: Scale and maintain reliable video conferencing during pandemic-driven traffic spikes. Approach: expanded capacity and operations to support rapid growth, combining cloud infrastructure with distributed architecture practices to keep services available while demand surged. Outcome: sustained service continuity for a rapidly growing user base, supporting widespread remote work and education.
Frequently Asked Questions
- What's the difference between Business Continuity and Disaster Recovery (DR)?
- Business Continuity is the overall plan to keep critical business functions running during disruptions (people, process, technology). Disaster Recovery is a subset focused specifically on restoring IT systems and data after an incident. BC covers things like remote-work procedures and customer support workflows; DR covers backups, replication, and failover.
- When should I use Business Continuity planning?
- Use it whenever downtime would harm revenue, safety, compliance, or customer trust. Start early if you run customer-facing apps, handle payments or sensitive data, operate 24/7 services, or have regulatory requirements. Even small teams benefit from a basic plan that defines critical systems, recovery priorities, and communication steps.
- How much does Business Continuity cost?
- Costs vary based on your recovery goals and architecture. Key factors include: redundancy (single-region vs multi-region), backup storage and retention, replication and failover services, additional compute kept warm/active, network egress during recovery, tooling for monitoring/incident response, and ongoing testing/training time. A basic plan with backups and documented procedures is relatively low cost; high-availability multi-region designs are more expensive but reduce downtime risk.
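The "basic plan" mentioned in the answers above (critical systems, recovery priorities, communication steps) can be written down as plain data. The following is a minimal sketch, not a prescribed format: the `CriticalSystem` fields, the system names, the downtime targets, and the contact roles are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalSystem:
    """One entry in a hypothetical business continuity plan."""
    name: str
    priority: int              # 1 = restore first (illustrative convention)
    max_downtime_minutes: int  # tolerable downtime; made-up example values
    contact: str               # who to notify; hypothetical role names


def recovery_order(systems):
    """Return systems sorted by priority, then by tightest downtime target."""
    return sorted(systems, key=lambda s: (s.priority, s.max_downtime_minutes))


def over_target(systems, outage_minutes):
    """Return names of systems whose tolerable downtime has been exceeded."""
    return [s.name for s in systems if outage_minutes > s.max_downtime_minutes]


# Example plan with invented entries.
PLAN = [
    CriticalSystem("customer-support-portal", 2, 240, "support-lead"),
    CriticalSystem("payments-api", 1, 15, "payments-oncall"),
    CriticalSystem("internal-wiki", 3, 1440, "it-helpdesk"),
]

if __name__ == "__main__":
    for system in recovery_order(PLAN):
        print(f"{system.priority}. {system.name} -> notify {system.contact}")
    print("Past target after 60 min:", over_target(PLAN, 60))
```

Keeping the plan as data like this makes the recovery order and the communication contacts easy to review and test, and tighter downtime targets in such a list are typically what drive the higher-cost options (warm standby, multi-region) described above.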
Category: software
Difficulty: basic