Failback

Definition

Failback is the process of returning operations to the primary system after it has been restored following a failover. It is like moving back home after an evacuation once it is safe to return.

Frequently Asked Questions

What's the difference between failback and failover?
Failover is the switch from the primary system to a standby/secondary system when there’s a problem. Failback is the planned move back to the primary system after it has been repaired and verified as stable. Failover is usually urgent; failback is usually controlled and scheduled.

When should I use failback?
Use failback after a failover event (or DR activation) once the primary environment is fully restored, data is synchronized, and you’ve validated application health. It’s best done during low-traffic windows with a rollback plan, clear success criteria (latency, error rate, replication lag), and stakeholder communication.
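
The success criteria mentioned above can be turned into an explicit go/no-go check. The following is a minimal sketch, not a definitive implementation: the names PrimaryHealth, FailbackCriteria, and failback_blockers are hypothetical, and the threshold values are placeholder assumptions rather than recommendations.

```python
from dataclasses import dataclass


@dataclass
class PrimaryHealth:
    """Snapshot of the restored primary; field names are hypothetical."""
    p95_latency_ms: float
    error_rate: float          # fraction of failed requests, 0.0 to 1.0
    replication_lag_s: float   # how far the primary trails the standby


@dataclass
class FailbackCriteria:
    """Illustrative thresholds only; tune them to your own service levels."""
    max_p95_latency_ms: float = 250.0
    max_error_rate: float = 0.01
    max_replication_lag_s: float = 5.0


def failback_blockers(health: PrimaryHealth, criteria: FailbackCriteria) -> list[str]:
    """Return the reasons failback should wait; an empty list means go."""
    blockers = []
    if health.p95_latency_ms > criteria.max_p95_latency_ms:
        blockers.append("latency above threshold")
    if health.error_rate > criteria.max_error_rate:
        blockers.append("error rate above threshold")
    if health.replication_lag_s > criteria.max_replication_lag_s:
        blockers.append("replication lag above threshold")
    return blockers
```

Calling failback_blockers with measurements taken from the restored primary returns the list of unmet criteria. Returning reasons rather than a single boolean makes it easier to report to stakeholders why a scheduled failback was postponed.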

How much does failback cost?
Failback cost depends on how your DR is designed. Common cost drivers include: ongoing replication and storage in the secondary site/region, data transfer/egress charges between regions or providers, running standby compute (warm/hot standby costs more than cold), additional licensing for DR tooling, and the operational effort to test and execute failback. A controlled failback may also incur temporary double-running costs while both environments are active during validation.
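
The temporary double-running cost can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: the function name overlap_window_cost and all of its parameters are hypothetical, it covers only compute and data transfer during the validation window, and real pricing depends on your provider and DR design.

```python
def overlap_window_cost(primary_hourly: float,
                        standby_hourly: float,
                        validation_hours: float,
                        transfer_gb: float,
                        transfer_price_per_gb: float) -> float:
    """Estimate spend while both environments run during failback validation.

    All inputs are assumptions supplied by the caller; licensing, storage,
    and staff time are deliberately left out of this sketch.
    """
    # Both environments are billed for the whole validation window.
    compute = (primary_hourly + standby_hourly) * validation_hours
    # Data synchronized back to the primary is charged per gigabyte.
    transfer = transfer_gb * transfer_price_per_gb
    return compute + transfer
```

For example, with made-up inputs of 4.00 and 3.00 per hour for the two environments, a 6-hour validation window, and 500 GB transferred at 0.02 per GB, the estimate is (4 + 3) × 6 + 500 × 0.02, or roughly 52 in whatever currency the prices use.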

Category: cloud

Difficulty: intermediate
