The practice of intentionally introducing failures into a system to test its resilience. It is like conducting fire drills so that everyone knows how to respond in an emergency.
Netflix practices chaos engineering by randomly terminating server instances in production, most famously with its Chaos Monkey tool, to verify that its systems handle failures gracefully.
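The random-termination idea can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not Netflix's tooling: `Instance`, `Service`, `terminate_random_instance`, and `run_experiment` are hypothetical names invented for this example, and no real servers or cloud APIs are involved. It kills one randomly chosen instance at a time and checks after each kill whether the service can still answer a request.

```python
import random


class Instance:
    """A stand-in for one server in a pool (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class Service:
    """Sends each request to the first healthy instance, skipping dead ones."""

    def __init__(self, instances):
        self.instances = instances

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # fail over to the next instance
        raise RuntimeError("no healthy instances left")


def terminate_random_instance(service, rng):
    """Inject a failure: kill one randomly chosen live instance and return it."""
    live = [i for i in service.instances if i.alive]
    if not live:
        return None
    victim = rng.choice(live)
    victim.alive = False
    return victim


def run_experiment(pool_size=3, kills=2, seed=0):
    """Kill up to `kills` instances; after each kill, check the service still answers."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    service = Service([Instance(f"server-{n}") for n in range(pool_size)])
    results = []
    for _ in range(kills):
        victim = terminate_random_instance(service, rng)
        if victim is None:
            break  # nothing left to terminate
        try:
            service.handle("ping")
            survived = True
        except RuntimeError:
            survived = False
        results.append((victim.name, survived))
    return results


if __name__ == "__main__":
    for name, survived in run_experiment():
        print(f"terminated {name}: service {'survived' if survived else 'FAILED'}")
```

With three instances and two kills, one instance always remains, so the service should survive every step. Setting `kills` equal to `pool_size` makes the final check fail, which is the kind of weakness a real experiment is meant to surface before an actual outage does.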
AWS Fault Injection Service (FIS) and Azure Chaos Studio are native managed services for running controlled fault-injection experiments. On GCP and OCI, teams commonly run open-source or third-party chaos tools on Kubernetes or VMs and pair them with the platform's monitoring and alerting services.