Service Mesh
Definition
A service mesh is an infrastructure layer that manages service-to-service communication in microservices architectures, enhancing security, reliability, and observability.
Use Cases
- Google: Secure and observe service-to-service communication across large microservices environments. Google has long used a service-mesh approach internally (e.g., an Envoy-based data plane and service-to-service policies) and offers managed Istio through Anthos Service Mesh for similar patterns in customer environments. Outcome: improved reliability and operational consistency from standardizing traffic management, authentication (mTLS), and telemetry across services.
- Lyft: Manage microservice communication at scale with consistent routing, retries, and observability. Lyft created Envoy (the widely used service-mesh proxy) and used it to standardize service-to-service networking behavior across its microservices. Outcome: better control over traffic behavior and easier debugging through consistent metrics and tracing across services.
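The retry behavior mentioned in the use cases above is normally applied by the mesh proxy rather than by application code. The following is only a minimal sketch of that idea in Python, not how Envoy or any particular mesh implements it. The names `RetryPolicy`, `UpstreamError`, and `call_with_retries` are hypothetical and exist only for this illustration.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


class UpstreamError(Exception):
    """Hypothetical error standing in for a retryable upstream failure."""


@dataclass(frozen=True)
class RetryPolicy:
    # Hypothetical fields; real meshes expose similar knobs under their own names.
    max_attempts: int = 3
    base_delay_s: float = 0.05
    max_delay_s: float = 1.0


def call_with_retries(
    call: Callable[[], T],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke `call`, retrying on UpstreamError with jittered exponential backoff."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return call()
        except UpstreamError:
            if attempt == policy.max_attempts:
                raise  # Attempts exhausted: surface the failure to the caller.
            # Exponential backoff capped at max_delay_s, with full jitter.
            ceiling = min(policy.max_delay_s, policy.base_delay_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

Keeping the policy in one place, outside each service, is the point of the sketch: every caller gets the same retry behavior without reimplementing it per language or per team.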
Provider Equivalents
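- AWS: AWS App Mesh (AWS has announced that App Mesh is being discontinued; check current AWS guidance before adopting it)
- Azure: Istio-based service mesh add-on for Azure Kubernetes Service (AKS), which supersedes the retired Open Service Mesh (OSM) add-on
- GCP: Cloud Service Mesh (formerly Anthos Service Mesh)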
Frequently Asked Questions
- What's the difference between a Service Mesh and an API Gateway?
- An API gateway mainly manages north-south traffic (requests coming into your system from users or external clients). A service mesh manages east-west traffic (service-to-service calls inside your system). Gateways focus on edge concerns such as authentication for external clients, rate limiting, and request routing into the platform. A service mesh focuses on internal reliability and security features such as mTLS between services, retries and timeouts, traffic splitting for canary releases, and consistent telemetry.
- When should I use a Service Mesh?
- Use a service mesh when you have many microservices and need consistent security and traffic controls without rewriting application code. Common triggers include a need for mTLS between services, standardized retries, timeouts, and circuit breaking, canary or blue/green releases with traffic splitting, and unified metrics and tracing across services. If you have only a few services, minimal compliance needs, or limited platform engineering capacity, start with simpler approaches (library-based instrumentation, an ingress with basic policies) and adopt a mesh later.
- How much does a Service Mesh cost?
- Costs come from (1) managed service or control plane pricing (if using a cloud-managed mesh), (2) extra compute and memory for sidecar proxies or node-level agents, (3) data transfer and load balancer costs from added hops, and (4) observability costs (ingestion and storage of metrics, logs, and traces). Even with open-source meshes, you still pay for the additional CPU and RAM, plus the operational overhead of running and monitoring the mesh components.
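The traffic splitting for canary releases mentioned in the answers above comes down to weighted routing: each request is sent to a service version with probability proportional to a configured weight. The sketch below illustrates only that selection step. The function `pick_backend` and the version names are hypothetical, and real meshes express the same weights declaratively in their own routing configuration.

```python
import random
from collections import Counter
from typing import Dict, Optional


def pick_backend(weights: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Choose a backend version with probability proportional to its weight."""
    if not weights or any(w < 0 for w in weights.values()) or sum(weights.values()) == 0:
        raise ValueError("weights must be non-negative and sum to a positive value")
    rng = rng or random.Random()
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Hypothetical 90/10 canary split between two versions of one service.
    split = {"checkout-v1": 90, "checkout-v2": 10}
    rng = random.Random(7)  # Seeded so the demo is repeatable.
    tally = Counter(pick_backend(split, rng) for _ in range(10_000))
    print(tally)  # Roughly 9,000 requests to v1 and 1,000 to v2.
```

Shifting the weights gradually (for example from 90/10 to 50/50 to 0/100) is what turns this selection step into a canary rollout, and setting a version's weight to 0 removes it from rotation.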
Category: emerging
Difficulty: advanced