Service disruption on October 20, 2025
Incident.io’s postmortem of the Oct 20 AWS us‑east‑1 outage: although most of their platform ran from GCP, third‑party services hosted in AWS caused failures in SAML auth, on‑call notifications, and their Scribe transcription. They describe the incident timeline, traffic/alert spikes, queuing/backlog and deployment pipeline failures (Docker Hub base image retrieval), mitigations applied during the incident (no‑op SMS, split queues, scaling, rate‑limits) and permanent fixes implemented afterward (remove Docker Hub dependency, region switching for Scribe, telecom redundancy, DB/autovacuum and escalation acquisition fixes). They apologize and outline next steps.