Cloudflare wrongly suspected that the widespread outage that took numerous websites offline on November 18 was caused by a DDoS attack, the company’s CEO has admitted. In a blog post breaking down what happened, Matthew Prince explained that his team was able to fix the issue once it realized its mistake. “The issue was not caused, directly or indirectly, by a cyber attack or malicious activity of any kind,” he wrote. It was instead caused by a change to its database systems’ permissions, which led to a problem with a file used by its Bot Management system.
The company’s Bot Management system uses a machine learning model to score bots for every request they make when they crawl Cloudflare’s network. Its clients rely on those bot scores to decide whether to allow or block specific bots from accessing their websites. One use of bot scores is blocking AI companies’ bots so they can’t use a website’s content to train their LLMs. In July, Cloudflare launched an experiment called “pay per crawl,” which allows website owners to let an AI bot crawl their pages if they get paid for access.
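To make the allow-or-block decision concrete, here is a minimal sketch of how a site owner's rule might act on a bot score. It is an illustration only, not Cloudflare's code: the function name `decide`, the default threshold of 30, and the assumption that scores run from 1 to 99 with lower values meaning "more likely automated" are all hypothetical choices made for this example.

```python
def decide(score: int, block_below: int = 30) -> str:
    """Return "block" for scores under the threshold, otherwise "allow".

    Assumption for this sketch: scores range from 1 to 99 and a lower
    score means the request is more likely to come from a bot.
    """
    if not 1 <= score <= 99:
        raise ValueError(f"score {score} is outside the assumed 1-99 range")
    return "block" if score < block_below else "allow"
```

A stricter site could raise `block_below`, while one that welcomes crawlers could lower it.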
Prince said the model relies on a “feature” configuration file to predict whether a request is automated. The feature file is refreshed every few minutes, and a change in the underlying mechanism generating that file caused a change in its size that triggered the error. “As a result, HTTP 5xx error codes were returned by the core proxy system that handles traffic processing for our customers, for any traffic that depended on the bots module,” Prince wrote.
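The failure mode described here, a periodically refreshed file that unexpectedly changes size, can be illustrated with a short sketch. This is not Cloudflare's implementation. It assumes, purely for illustration, that the feature file is a JSON list of feature names and that the consumer enforces a hypothetical cap of 200 entries. The sketch shows one defensive option: reject a file that fails validation and keep the last known-good version rather than letting the error propagate.

```python
import json

# Hypothetical cap on how many features the consumer will accept.
MAX_FEATURES = 200


class FeatureFileError(ValueError):
    """Raised when a refreshed feature file fails validation."""


def parse_feature_file(text: str, max_features: int = MAX_FEATURES) -> list[str]:
    """Parse a JSON list of feature names and reject oversized files."""
    features = json.loads(text)
    if not isinstance(features, list):
        raise FeatureFileError("feature file must be a JSON list")
    if len(features) > max_features:
        raise FeatureFileError(
            f"{len(features)} features exceeds limit of {max_features}"
        )
    return features


def refresh(
    current: list[str], new_text: str, max_features: int = MAX_FEATURES
) -> list[str]:
    """Return the new feature list, or keep the current one if the file is bad."""
    try:
        return parse_feature_file(new_text, max_features)
    except (FeatureFileError, json.JSONDecodeError):
        # Fall back to the last known-good configuration.
        return current
```

Under these assumptions, a file that grows past the cap is ignored and requests continue to be scored with the previous feature list.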
The incident is Cloudflare’s worst outage in years. The company said it hasn’t had an outage that has “caused the majority of core traffic to stop flowing through [its] network” since 2019. Prince apologized for the issue on behalf of his team.
