Cloudflare outage on November 18, 2025
tl;dr: a configuration-generation tool had buggy error handling code. Triggered by a permissions change, it generated over-large configs which then caused a crash in buggy config-reading code in their Bot Management module. This configuration was rolled out globally within minutes.
As @kiall in ITC Slack notes: "the one thing I'd be pushing on after an outage like this (config mistake, propagated globally..) is "treat config like any other deployment - with a slow and steady rollout" -- and this is not called out in the postmortem. I agree this is a significant oversight.....
Tags: postmortem cloudflare outages configuration deployment cloud error-handling rollouts via:itc