The 499 status code is the quietest reason your site is missing from AI search. ChatGPT, Claude, and Perplexity all fetch pages in real time when they answer questions, and if your origin does not respond fast enough, the agent disconnects before the bytes arrive. The 499 in your logs, nginx's code for a client that closed the connection before the response was ready, is the marker. Profound's research, covered in the 499 eligibility post on this site, found that pages with very high failure rates against AI fetchers receive roughly an order of magnitude fewer citation events.
The single highest-leverage fix for sites on Cloudflare is a Cache Rule that makes AI bot HTML requests eligible for the edge cache, with TTL bands appropriate to the content type and explicit exclusions for dynamic surfaces (cart, checkout, account, admin). I have written this rule on enough domains now that hand-typing it is annoying. The Cloudflare AI Cache Rule Generator at /tools/cloudflare-ai-cache-rule-generator/ writes it for you.
What the tool does
You enter your domain. You pick a content profile (news, blog, evergreen, ecommerce, local). You choose which AI bot user agents to include. You confirm the path exclusions. You click generate. Out comes:
- Dashboard expression — paste directly into Cloudflare Caching > Cache Rules > Create rule > Edit expression
- Terraform — drop into your Cloudflare provider config; you replace zone_id and that is it
- cURL — for the Cloudflare API; replace ZONE_ID and CF_API_TOKEN, run, done (a sketch of the shape follows this list)
- Apply checklist — a numbered list of every step you need to take in the dashboard, including the cache settings (Cache eligibility, Cache level, Edge Cache TTL, Browser Cache TTL) the expression alone does not configure
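For orientation, here is a minimal sketch of the shape the cURL output takes, assuming the Rulesets API's http_request_cache_settings phase; the command the tool actually generates carries the full bot list, exclusions, and profile TTL, and the rule body below is a placeholder.

```bash
# Minimal sketch: deploy one cache rule via the Cloudflare Rulesets API.
# ZONE_ID and CF_API_TOKEN are placeholders, and the expression here is
# abbreviated. Caution: PUT replaces the phase's entire entrypoint
# ruleset, so fold in any existing cache rules before running.
curl -X PUT \
  "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/rulesets/phases/http_request_cache_settings/entrypoint" \
  -H "Authorization: Bearer ${CF_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{
    "rules": [{
      "description": "AI bot HTML edge cache",
      "expression": "http.host eq \"example.com\" and http.user_agent contains \"GPTBot\"",
      "action": "set_cache_settings",
      "action_parameters": {
        "cache": true,
        "edge_ttl": { "mode": "override_origin", "default": 86400 }
      }
    }]
  }'
```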
The TTL bands by profile are sensible defaults: 1 hour for news, 1 day for blogs and guides, 7 days for evergreen reference, 30 minutes for ecommerce category and product detail (PDP) pages because stale prices are a real concern, and 7 days for local-business pages.
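If you are wiring TTLs by hand rather than through the tool, the same bands expressed in seconds (the unit the edge_ttl field in the API sketch above takes) map out like this:

```bash
# The profile TTL bands from above, as seconds for edge_ttl.default.
# Requires bash 4+ for associative arrays.
declare -A EDGE_TTL=(
  [news]=3600          # 1 hour
  [blog]=86400         # 1 day
  [evergreen]=604800   # 7 days
  [ecommerce]=1800     # 30 minutes
  [local]=604800       # 7 days
)
echo "blog TTL: ${EDGE_TTL[blog]}s"
```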
What this fix actually changes
For human users, nothing. The rule is scoped to AI bot user agents via the http.user_agent contains clause and excludes dynamic paths via not (http.request.uri.path contains) clauses.
For AI bots, the change is dramatic. A cached HTML response served from Cloudflare's edge has a Time to First Byte (TTFB) measured in tens of milliseconds, regardless of how slow your origin is. The 499 timeout disappears for cached pages. The bot moves from "did not arrive in time, drop from candidate set" to "arrived in 80ms, included in candidate set."
This is the difference between being eligible for citation and being invisible.
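You can watch the change from a terminal. A rough spot-check, assuming curl 7.84+ for the %header write-out variable and a placeholder URL:

```bash
# Two requests with an AI bot user agent: the first may MISS and prime
# the edge cache; the second should return HIT with a sub-100ms TTFB.
for i in 1 2; do
  curl -s -o /dev/null -A "GPTBot" \
    -w "cf-cache-status=%header{cf-cache-status} ttfb=%{time_starttransfer}s\n" \
    "https://example.com/blog/sample-post/"
done
```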
What it does not do
The rule does not change which AI bots can crawl your site. That is governed by ai.txt, robots.txt, and your Cloudflare AI Crawl Control settings, all separate from cache behavior.
The rule does not enable Cloudflare's Markdown for Agents feature. That is a separate toggle in Caching > AI Crawl Control on Pro, Business, and Enterprise plans. The Cache Rule and Markdown for Agents are complementary; using both is the recommended pattern.
The rule does not magically fix slow origins. If your origin's 99th-percentile response time on long-tail URLs is 8 seconds, the cache rule helps everything that is in the cache, and the long tail still hits origin and still risks 499s. The cache rule plus origin profiling is the durable fix.
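Profiling that long tail does not require tooling beyond your access logs. A rough sketch, assuming nginx logs with $request_time as the last field (adjust the field index for your log format) and gawk for asort():

```bash
# Rough p99 of origin response time from nginx access logs.
# Assumes $request_time is the last field; asort() requires gawk.
gawk '{ t[NR] = $NF + 0 }
END {
  n = asort(t); idx = int(n * 0.99); if (idx < 1) idx = 1
  if (n > 0) printf "p99 over %d requests: %.3fs\n", n, t[idx]
}' /var/log/nginx/access.log
```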
What is in the expression
A typical generated expression looks like this:

```
http.host eq "example.com"
and (
  http.user_agent contains "GPTBot" or
  http.user_agent contains "ChatGPT-User" or
  http.user_agent contains "OAI-SearchBot" or
  http.user_agent contains "ClaudeBot" or
  http.user_agent contains "Claude-User" or
  http.user_agent contains "PerplexityBot" or
  http.user_agent contains "Applebot-Extended"
)
and not (http.request.uri.path contains "/cart")
and not (http.request.uri.path contains "/checkout")
and not (http.request.uri.path contains "/account")
and not (http.request.uri.path contains "/wp-admin")
and not (http.request.uri.path contains "/admin")
and not (http.request.uri.path contains "/api")
```
The bot list mirrors what shows up in real-world edge logs. The exclusion list is the standard set of dynamic paths to keep out of bot caching. Both are configurable in the tool.
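The exclusions deserve a quick negative test after deploying: they should hold even under a bot user agent. A spot-check, with example.com and /cart standing in for your zone and paths:

```bash
# A cart URL fetched with a bot UA must never be an edge HIT; expect
# DYNAMIC or BYPASS here.
curl -s -o /dev/null -A "GPTBot" \
  -w "cf-cache-status=%header{cf-cache-status}\n" \
  "https://example.com/cart"
```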
How to verify the fix worked
The Apply checklist tab includes the validation queries. Twenty-four hours after deploying the rule, segment your Cloudflare Analytics by user agent and confirm:
- Cache HIT rate against AI bot user agents has increased
- 499 rate against AI bot user agents has decreased
- Origin request rate has decreased
- No surprise increases in stale-content complaints from human users (this is the rule's blast-radius check)
If the data lines up, you have eliminated one of the largest invisible reasons your site was missing from AI answers. If the cache HIT rate did not increase, the most likely cause is a misconfigured Cache Level (set to Standard rather than Cache Everything) or origin headers overriding the edge TTL.
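The origin-header case is quick to confirm from a terminal, with a placeholder URL:

```bash
# If HIT rate is flat, look at what the origin sends: Cache-Control
# no-store/private or a Set-Cookie on HTML defeats edge caching unless
# Edge Cache TTL is set to override the origin.
curl -sI -A "GPTBot" "https://example.com/blog/sample-post/" \
  | grep -iE '^(cf-cache-status|cache-control|set-cookie|age):'
```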
Related reading
- Your Site Might Be Invisible To ChatGPT for the underlying 499 problem and Mike King's framing of "eligibility is the new ranking"
- The Conversation Has Moved Past The Model for the broader runtime-layer shift Cloudflare and OpenAI both shipped in April 2026
- Agent Runtime Readiness audit to test whether your individual pages survive an agent runtime
- The Best MCP Servers By Industry for the related agent-tooling shift
Fact-check notes and sources
- Cloudflare Cache Rules documentation: Cloudflare developers — Cache Rules
- Cloudflare Markdown for Agents: docs, announcement blog
- Cloudflare Agent Readiness score: blog announcement
- Profound research on AI fetch and citation: How ChatGPT sources the web
- Mike King's framing: Optimizing the New Search
Cache configuration affects production traffic. Apply on a staging hostname or behind a feature flag before promoting. Verify with Cache Status logging (HIT, MISS, BYPASS, EXPIRED, REVALIDATED) before declaring the rule deployed.
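A minimal staging sweep before promoting, assuming a staging.example.com hostname behind the same zone and curl 7.84+:

```bash
# Sweep one cacheable page and the excluded paths on staging; the
# hostname and paths are placeholders.
for p in /blog/post-a/ /cart /checkout /account; do
  status=$(curl -s -o /dev/null -A "ClaudeBot" \
    -w "%header{cf-cache-status}" "https://staging.example.com$p")
  echo "$p -> ${status:-none}"
done
```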