Cloudflare just gave AI companies an ultimatum on scraping

Anonymous, CryptoCompass newsroom
July 2, 2026

A Hard Deadline for Mixed-Use Crawlers

Cloudflare has issued the AI industry a deadline to separate the web crawlers used for traditional search from those used for AI agents and model training. Starting September 15, 2026, its default settings will block mixed-use crawlers, meaning those that blend search, agent use, and training, from any pages that host ads unless the site owner adjusts the settings. The change applies to new customers, new sites set up by existing customers, and all existing free-tier users.

The move follows a broader shift in how Cloudflare frames the relationship between publishers and AI companies. In that framing, the web previously rewarded creators by directing users to original websites, whereas AI crawlers now collect text, articles, and images to generate responses without sending users to the source, depriving publishers of traffic and advertising revenue.

Cloudflare singled out Google for criticism, arguing that the search giant's dominance gives it access to roughly twice the web content available to other AI companies, because remaining discoverable in search effectively requires consenting to AI use. Google has pushed back, noting that it provides a robots.txt token called Google-Extended that lets site owners opt out of having their content used for AI training and products like Gemini, without affecting a site's inclusion in Google Search.
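To make the opt-out concrete, here is a minimal sketch of how a robots.txt rule aimed at Google-Extended can be checked with Python's standard-library parser. The robots.txt contents and the example.com URL are hypothetical, chosen only for illustration. The sketch shows how the rules are read, not how Google's crawlers behave in practice.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: opt out of Google-Extended
# while leaving every other user agent (including Googlebot) allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/articles/some-story"  # placeholder URL
search_allowed = parser.can_fetch("Googlebot", url)
ai_allowed = parser.can_fetch("Google-Extended", url)

print(f"Googlebot allowed: {search_allowed}")
print(f"Google-Extended allowed: {ai_allowed}")
```

Under these assumed rules, the search user agent stays allowed while the Google-Extended token is disallowed. That is the separation Google points to in its response.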

The scale of Cloudflare's position amplifies the policy's significance. Cloudflare handles roughly 20 percent of web traffic, giving it unusual leverage to reshape how AI companies access publisher content at scale. To underscore the stakes, Cloudflare cited crawl-to-referral ratios showing Google crawls sites about 14 times for every referral, while OpenAI's ratio stands at 1,700:1 and Anthropic's at 73,000:1.
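The ratios are easier to compare when converted to a common base. The short sketch below uses only the three figures quoted above and restates them as referrals per million crawls. The helper name is an illustrative choice, not something from Cloudflare's data.

```python
# Crawl-to-referral ratios as quoted in the article (crawls per one referral).
CRAWLS_PER_REFERRAL = {
    "Google": 14,
    "OpenAI": 1_700,
    "Anthropic": 73_000,
}


def referrals_per_million_crawls(crawls_per_referral: int) -> float:
    """Convert an N:1 crawl-to-referral ratio into referrals per 1,000,000 crawls."""
    return 1_000_000 / crawls_per_referral


for company, ratio in CRAWLS_PER_REFERRAL.items():
    per_million = referrals_per_million_crawls(ratio)
    # How many times larger this ratio is than Google's.
    relative_to_google = ratio / CRAWLS_PER_REFERRAL["Google"]
    print(
        f"{company}: {per_million:,.1f} referrals per million crawls, "
        f"{relative_to_google:,.0f}x Google's ratio"
    )
```

On these figures, a million crawls would correspond to roughly 71,000 referrals for Google, about 590 for OpenAI, and about 14 for Anthropic.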

From Pay Per Crawl to Pay Per Use

A year ago, Cloudflare launched Pay Per Crawl so publishers could charge AI companies for crawling their content. Now, it is evolving that into Pay Per Use, so publishers are paid when their content actually creates value, not just when it is fetched.

Cloudflare data suggests that over 50 percent of crawl traffic from AI crawlers is spent re-fetching unchanged pages, resulting in wasted bandwidth for publishers and wasted compute for AI companies. The Pay Per Use model is designed to correct that inefficiency by tying compensation to outcomes rather than activity.
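For context on the re-fetching problem, HTTP already offers conditional requests, which let a client ask for a page only if it has changed. The sketch below is an illustration of that standard mechanism, not something the article says Cloudflare requires or that Pay Per Use is built on. The CachedPage type and both helper functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedPage:
    """What a hypothetical crawler remembers about a page it fetched before."""
    etag: Optional[str]
    last_modified: Optional[str]
    body: bytes


def conditional_headers(cached: Optional[CachedPage]) -> Dict[str, str]:
    """Build request headers asking the server to skip the body if unchanged."""
    headers: Dict[str, str] = {}
    if cached is None:
        return headers
    if cached.etag:
        headers["If-None-Match"] = cached.etag
    if cached.last_modified:
        headers["If-Modified-Since"] = cached.last_modified
    return headers


def apply_response(cached: Optional[CachedPage], status: int,
                   headers: Dict[str, str], body: bytes) -> CachedPage:
    """Keep the cached copy on 304 Not Modified; otherwise store the new one."""
    if status == 304 and cached is not None:
        return cached
    return CachedPage(headers.get("ETag"), headers.get("Last-Modified"), body)


# Usage with made-up values: the first fetch has no cache, the second is conditional.
first = apply_response(None, 200, {"ETag": '"v1"'}, b"<html>story</html>")
print(conditional_headers(first))
second = apply_response(first, 304, {}, b"")
print(second is first)
```

A server that answers 304 sends no body, so an unchanged page costs a small header exchange instead of a full download.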

Under Pay Per Use, publishers could be paid when their content appears in an AI result or when an agent purchases premium information for a specific task. Ceramic.ai and You.com are among the first partners in the program. An Attribution Business Insights dashboard is also planned to show how AI bots access content, where that content is cited, and how much human traffic different AI platforms return.

The policy represents a significant structural challenge for AI companies that have built data pipelines around bulk, unrestricted web access. The change shifts web-crawl access from implicit to explicit and raises the operational cost of sourcing web-scale training data, since pipelines built on bulk crawling will need opt-in checks, credentials, or paid access.

Sources:
TechCrunch: Cloudflare's new policy pushes AI companies to pay for publishers' content
Cloudflare Blog: Control content use for AI training with managed robots.txt and blocking for monetized content
Cloudflare Press Release: Your Content, Your Rules