Amount of allowable downtime or errors before reliability targets are breached, used to balance reliability with feature development. Like having a spending allowance for risk.
With a 99.9% availability SLO measured over a 30-day month, a service has about 43 minutes of error budget (0.1% of 43,200 minutes). If the budget is depleted, teams focus on reliability instead of new features.
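The 43-minute figure comes from simple arithmetic, which can be sketched in Python as follows. This is a minimal illustration and not a standard API: the function names, the 30-day window, and the use of downtime minutes as the only measure of budget consumption are all assumptions made for the example.

```python
def error_budget_minutes(target_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability target over a window.

    Hypothetical helper: assumes the budget is measured purely as downtime
    and that the window is a fixed number of days (30 by default).
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in the range (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - target_percent / 100)


def budget_remaining_minutes(
    target_percent: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Budget left after the downtime observed so far (negative if overspent)."""
    return error_budget_minutes(target_percent, window_days) - downtime_minutes


if __name__ == "__main__":
    # 0.1% of 43,200 minutes in a 30-day window
    print(round(error_budget_minutes(99.9), 1))
    # Budget left after 30 minutes of downtime in the same window
    print(round(budget_remaining_minutes(99.9, 30), 1))
```

A negative result from budget_remaining_minutes means the budget is exhausted, which is the point at which a team would typically pause feature work in favor of reliability.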
Error Budget is an SRE/SLI-SLO practice rather than a specific cloud service. All major clouds support implementing it using monitoring/observability (metrics, logs, traces), alerting, incident management, and SLO tooling.