CloudWatch
Definition
Amazon CloudWatch is the AWS monitoring and observability service. It collects and tracks metrics, logs, and events from your applications and infrastructure so you can detect problems, troubleshoot them, and tune performance.
Use Cases
- Netflix: Monitor the health and performance of large-scale services running on AWS, including detecting latency spikes and infrastructure issues. — Uses AWS-native monitoring (including CloudWatch metrics and alarms) as part of its broader observability approach to track service and infrastructure signals and trigger automated responses and on-call notifications. (Faster detection of incidents and performance regressions, improving service reliability at scale.)
- Airbnb: Operational monitoring and alerting for production workloads to catch errors and resource saturation before they impact users. — Uses AWS monitoring capabilities (including CloudWatch metrics/alarms) alongside internal tooling to track key service indicators and notify engineers when thresholds are breached. (Improved incident response and reduced time to detect operational issues.)
- NASA: Monitor AWS-hosted workloads supporting data processing and mission-related systems to ensure availability and performance. — Uses AWS monitoring services such as CloudWatch to collect metrics and set alarms for critical resources, integrating alerts into operational processes. (Better visibility into system health and quicker response to anomalies.)
Provider Equivalents
- AWS: Amazon CloudWatch
- Azure: Azure Monitor
- GCP: Google Cloud Observability (Cloud Monitoring and Cloud Logging)
- OCI: OCI Monitoring and OCI Logging
Frequently Asked Questions
- What's the difference between CloudWatch and AWS CloudTrail?
- CloudWatch focuses on observability: metrics (like CPU), logs (application/system logs), dashboards, and alarms. CloudTrail focuses on auditing: it records API calls and account activity (who did what, when, and from where). Use CloudWatch to detect performance/availability issues; use CloudTrail to investigate changes and security-related activity.
- When should I use CloudWatch?
- Use CloudWatch when you need to monitor AWS resources or applications, troubleshoot issues, or set alerts. Common scenarios include: alerting on EC2 CPU utilization, or on memory and disk usage (which require the CloudWatch agent), monitoring ALB request latency and 5xx errors, tracking Lambda errors and throttles, centralizing application logs, and creating dashboards for operational visibility.
- How much does CloudWatch cost?
- CloudWatch pricing is usage-based. Costs commonly come from: custom metrics (many AWS-provided metrics carry no additional charge), alarms, log ingestion and storage (CloudWatch Logs), log queries and analytics (for example, Logs Insights), dashboards, and optional features such as detailed monitoring for some services. Your bill depends on how many metrics you publish, how much log data you ingest and store, how many alarms you run, and how often you query logs.
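As a rough illustration of the EC2 CPU alerting scenario mentioned in the FAQ above, the following Python sketch builds the parameters for a CloudWatch alarm and, optionally, submits them with boto3's `put_metric_alarm`. The helper names (`build_cpu_alarm`, `create_alarm`), the alarm name, the 80% threshold, the 2 x 5 minute evaluation window, and the instance ID and SNS topic ARN are all assumptions chosen for the example, not recommended values.

```python
from typing import Any, Dict


def build_cpu_alarm(instance_id: str, sns_topic_arn: str,
                    threshold_percent: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch PutMetricAlarm API.

    The name, threshold, and evaluation window are illustrative assumptions.
    """
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "AlarmDescription": "Average CPU above threshold for 10 minutes",
        "Namespace": "AWS/EC2",                  # metrics published by EC2 itself
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                           # seconds per datapoint
        "EvaluationPeriods": 2,                  # 2 x 300 s = 10 minutes
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],         # notify this SNS topic on ALARM
        "TreatMissingData": "missing",
    }


def create_alarm(alarm_kwargs: Dict[str, Any]) -> None:
    """Submit the alarm. Requires boto3 and AWS credentials to be configured."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("cloudwatch").put_metric_alarm(**alarm_kwargs)


if __name__ == "__main__":
    # Placeholder identifiers; replace with a real instance ID and topic ARN.
    params = build_cpu_alarm(
        "i-0123456789abcdef0",
        "arn:aws:sns:us-east-1:123456789012:ops-alerts",
    )
    print(params["AlarmName"], params["Threshold"])
    # create_alarm(params)  # uncomment to create the alarm in your account
```

Keeping the parameter construction separate from the API call makes the alarm definition easy to review or unit test without touching an AWS account. Memory and disk alarms would follow the same shape but use the namespace and metric names that your CloudWatch agent configuration publishes.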
Category: monitoring
Difficulty: intermediate
See Also