The ability to understand what is happening inside a system by examining its outputs, such as logs, metrics, and traces. It is like having security cameras, temperature sensors, and activity logs that together show everything happening in a building.
Good observability lets you see not just that your website is slow, but exactly which database query is causing the problem and why.
All major clouds provide an observability stack covering metrics (performance numbers), logs (event records), and traces (request paths). AWS commonly combines CloudWatch (metrics and logs), X-Ray (tracing), and CloudTrail (API and audit activity). Azure uses Azure Monitor, with Application Insights for application telemetry and Activity Log for control-plane events. GCP bundles Cloud Monitoring, Cloud Logging, and Cloud Trace as Google Cloud Observability, with Cloud Audit Logs for control-plane events. OCI provides Monitoring, Logging, and Application Performance Monitoring (APM), with Audit for control-plane tracking.
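
To make the three signals concrete, here is a minimal sketch in plain Python that uses only the standard library and no cloud SDK. Every name in it (`span`, `handle_request`, `slowest_span`, the `load_cart` and `load_recommendations` steps, and the `requests_total` counter) is hypothetical and chosen for illustration, and the `time.sleep` calls stand in for real database queries. It records a counter (metric), writes an event record (log), and times each step of a request (trace), then uses the trace to find the slow step.

```python
import logging
import time
import uuid
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("shop")  # hypothetical service name

spans = []    # trace data: one record per timed step of a request
metrics = {}  # metric data: running counters keyed by name


@contextmanager
def span(trace_id, name):
    """Time a block of work and record it as a span of the given trace."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        spans.append({"trace_id": trace_id, "name": name, "ms": elapsed_ms})


def handle_request():
    """Handle one simulated request and emit all three signals."""
    trace_id = uuid.uuid4().hex
    # Metric: a number that aggregates across requests.
    metrics["requests_total"] = metrics.get("requests_total", 0) + 1
    # Trace: each step is timed and tagged with the same trace ID.
    with span(trace_id, "load_cart"):
        time.sleep(0.01)  # stand-in for a fast query (assumption)
    with span(trace_id, "load_recommendations"):
        time.sleep(0.05)  # stand-in for a slow query (assumption)
    # Log: an event record that carries the trace ID for correlation.
    log.info("request handled trace_id=%s", trace_id)
    return trace_id


def slowest_span(trace_id):
    """Return the span that took longest within one trace."""
    own = [s for s in spans if s["trace_id"] == trace_id]
    return max(own, key=lambda s: s["ms"])


trace_id = handle_request()
print(slowest_span(trace_id)["name"])
```

The metric alone would only show that requests are being served, and the log alone would only show that one finished. The trace ID shared between the log line and the spans is what lets you move from "this request was slow" to "this step was the slow one", which is the role the managed tracing services listed above play at scale.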