Saga Pattern
Definition
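The Saga Pattern is a design approach for managing distributed transactions across multiple services. It splits a business process into a sequence of local transactions, one per service, and pairs each with a compensating action that undoes its effect if a later step fails. The system reaches eventual consistency without a global lock or two-phase commit.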
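As a rough illustration of the idea, the sketch below runs a list of steps in order and, when one fails, runs the compensating actions of the steps that already completed, in reverse order. The names `SagaStep`, `run_saga`, and the seat/payment steps are hypothetical and chosen only for this example; a production saga would also persist its state and handle failures inside compensations, which this in-memory version does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    """One saga step (hypothetical type for this sketch)."""
    name: str
    action: Callable[[], None]        # local transaction in one service
    compensation: Callable[[], None]  # undoes the action if a later step fails


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # The failed step never committed, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensation()
            return False
    return True


# Example run: the second step fails, so the first is compensated.
log: List[str] = []


def fail_payment() -> None:
    raise RuntimeError("payment declined")


steps = [
    SagaStep("reserve_seat",
             lambda: log.append("seat reserved"),
             lambda: log.append("seat released")),
    SagaStep("charge_card",
             fail_payment,
             lambda: log.append("charge refunded")),
]
ok = run_saga(steps)
print(ok, log)
```

In this run the card charge never succeeds, so no refund is issued; only the seat reservation is released. This is an orchestration-style sketch in which one function drives every step. A choreography-style saga would instead have each service react to events published by the others.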
Use Cases
- Uber: Coordinating multi-step operations across microservices (e.g., trip lifecycle actions that touch dispatch, payments, and notifications) without distributed locks or two-phase commit (2PC). Uber engineers have described using the saga concept in microservice architectures: a business process is split into local transactions in each service, with compensating actions to undo work when a later step fails. This is commonly paired with asynchronous messaging and careful idempotency handling. Outcome: improved resilience and scalability for cross-service workflows, because failures are handled with compensations and retries rather than by blocking the whole system.
- Netflix: Managing long-running, multi-service workflows (e.g., provisioning and operational processes) in a distributed system where steps may fail and need recovery actions. Netflix has publicly discussed orchestration-style workflow engines (notably Conductor) used to coordinate tasks across services. Although not every write-up labels the approach a "saga", it aligns with saga principles: stepwise execution, retries, and explicit failure handling and compensation logic for distributed operations. Outcome: more reliable execution of complex workflows at scale, with better visibility into workflow state and more operational control over retries and failure handling.
Frequently Asked Questions
- What's the difference between Saga Pattern and two-phase commit (2PC)?
- Two-phase commit tries to make multiple services commit a transaction atomically, which often requires locking and tight coordination. A saga avoids a single global transaction: each service commits its own local transaction, and if something fails later, the system runs compensating actions to undo earlier steps. Sagas usually scale better and are more fault-tolerant in microservices, but they provide eventual consistency rather than strict atomicity.
- When should I use Saga Pattern?
- Use a saga when a business process spans multiple services or databases and you can’t (or shouldn’t) rely on a single ACID transaction. It’s a good fit for long-running workflows (seconds to days), high-scale microservices, and scenarios where eventual consistency is acceptable (e.g., travel booking, order fulfillment). Avoid it when you truly need strict atomicity across resources or when compensating actions are impossible or unsafe.
- How much does Saga Pattern cost?
- The pattern itself is free, but implementing it has costs: (1) workflow/orchestration runtime (e.g., state machine/workflow executions), (2) messaging (queues/pub-sub), (3) data storage for saga state, outbox/inbox tables, and logs, (4) compute for workers/handlers, and (5) engineering/operational overhead (testing compensations, observability, retries, and idempotency). Costs scale with the number of workflow steps, message volume, retries, and how long you retain state and logs.
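Retries and idempotency come up repeatedly in the answers above, so a small sketch may help. Messaging systems often deliver the same message more than once, which means a saga step handler should apply each message's effect only once. The example below assumes a hypothetical `PaymentHandler` that records processed message IDs in memory, standing in for an inbox table; the class, method, and identifiers are illustrative and not taken from any particular library.

```python
from typing import Dict, Set


class PaymentHandler:
    """Hypothetical handler that deduplicates messages by ID (in-memory inbox)."""

    def __init__(self) -> None:
        self.processed_ids: Set[str] = set()  # stands in for an inbox table
        self.balances: Dict[str, int] = {}

    def handle_charge(self, message_id: str, account: str, amount: int) -> bool:
        """Apply the charge once; return False if the message is a duplicate."""
        if message_id in self.processed_ids:
            return False  # already applied, so a redelivery is a no-op
        self.balances[account] = self.balances.get(account, 0) - amount
        self.processed_ids.add(message_id)
        return True


handler = PaymentHandler()
first = handler.handle_charge("msg-1", "acct-42", 30)
retry = handler.handle_charge("msg-1", "acct-42", 30)  # simulated redelivery
print(first, retry, handler.balances)
```

In a real service the processed-ID record and the balance change would normally be written in the same local database transaction, so that a crash between the two cannot produce a double charge or a lost one. The stored IDs are part of the storage cost listed above.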
Category: software
Difficulty: advanced