I Got Tired Of Hand-Writing Cloudflare Cache Rules For AI Bots. So I Built A Generator.

The 499 status code is the quietest reason your site is missing from AI search. It is a non-standard code, logged by nginx and by Cloudflare, that means the client closed the connection before the server finished responding. ChatGPT, Claude, and Perplexity all fetch pages in real time when they answer questions, and if your origin does not respond fast enough, the agent disconnects before the bytes arrive. The 499 in your logs is the marker. Profound's research, covered in the 499 eligibility post on this site, found that pages with very high failure rates against AI fetchers receive roughly an order of magnitude fewer citation events.
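To see whether this is happening on your own site, a rough first pass is to count 499s per AI bot in an access log. The sketch below assumes an nginx-style combined log format and uses the bot tokens from the generated expression shown later in this post; the function name and regex are illustrative, so adjust them to whatever your logs actually look like.

```python
import re
from collections import Counter

# User-agent substrings taken from the generated expression in this post.
AI_BOT_TOKENS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
    "Claude-User", "PerplexityBot", "Applebot-Extended",
]

# Assumption: combined log format, i.e.
# ... "METHOD /path HTTP/x" STATUS BYTES "referer" "user agent"
LINE_RE = re.compile(
    r'"\S+ \S+ [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def count_499_by_bot(lines):
    """Return {bot_token: (count_of_499s, total_requests)} for AI bot traffic."""
    totals = Counter()
    failures = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        user_agent = match.group("ua")
        bot = next((t for t in AI_BOT_TOKENS if t in user_agent), None)
        if bot is None:
            continue  # not one of the listed AI bots
        totals[bot] += 1
        if match.group("status") == "499":
            failures[bot] += 1
    return {bot: (failures[bot], totals[bot]) for bot in totals}
```

Feed it an open file handle or any iterable of log lines. A bot with a high ratio of 499s to total requests is the one timing out against your origin.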

The single highest-leverage fix for sites on Cloudflare is a Cache Rule that makes AI bot HTML requests eligible for the edge cache, with TTL bands appropriate to the content type and explicit exclusions for dynamic surfaces (cart, checkout, account, admin). I have written this rule on enough domains now that hand-typing it is annoying. The Cloudflare AI Cache Rule Generator at /tools/cloudflare-ai-cache-rule-generator/ writes it for you.

What the tool does

You enter your domain. You pick a content profile (news, blog, evergreen, ecommerce, local). You choose which AI bot user agents to include. You confirm the path exclusions. You click generate. Out comes:

  1. Dashboard expression — paste directly into Cloudflare Caching > Cache Rules > Create rule > Edit expression
  2. Terraform — drop into your Cloudflare provider config; you replace zone_id and that is it
  3. cURL — for the Cloudflare API; replace ZONE_ID and CF_API_TOKEN, run, done
  4. Apply checklist — a numbered list of every step you need to take in the dashboard, including the cache settings the expression alone does not configure (Cache eligibility, Edge TTL, Browser TTL)

The TTL bands by profile are sensible defaults: 1 hour for news, 1 day for blogs and guides, 7 days for evergreen reference, 30 minutes for ecommerce category and product detail (PDP) pages, where stale prices are a real concern, and 7 days for local-business pages.
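For anyone scripting this rather than pasting it, here is a minimal sketch of how those TTL bands could map onto a Cache Rule body. The field names (set_cache_settings, cache, edge_ttl, override_origin) follow my understanding of Cloudflare's Rulesets API for the cache settings phase; treat them as an assumption and check the current API docs before sending anything. The function name and profile keys are hypothetical, and the generator's real output may be shaped differently.

```python
# TTL bands from this post, in seconds.
PROFILE_TTL_SECONDS = {
    "news": 60 * 60,                 # 1 hour
    "blog": 24 * 60 * 60,            # 1 day
    "evergreen": 7 * 24 * 60 * 60,   # 7 days
    "ecommerce": 30 * 60,            # 30 minutes
    "local": 7 * 24 * 60 * 60,       # 7 days
}

def build_cache_rule(expression, profile):
    """Build one rule object for the cache settings phase (assumed schema)."""
    if profile not in PROFILE_TTL_SECONDS:
        raise ValueError(f"unknown content profile: {profile!r}")
    return {
        "description": f"AI bot HTML cache ({profile})",
        "expression": expression,
        "action": "set_cache_settings",
        "action_parameters": {
            # Make matching responses eligible for the edge cache.
            "cache": True,
            # Use the profile TTL instead of origin cache headers.
            "edge_ttl": {
                "mode": "override_origin",
                "default": PROFILE_TTL_SECONDS[profile],
            },
        },
    }
```

The returned dict is only the rule itself. It still has to be wrapped in a ruleset request for your zone, which is the part the cURL output covers.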

What this fix actually changes

For human users, nothing. The rule is scoped to AI bot user agents via the http.user_agent contains clause and excludes dynamic paths via not (http.request.uri.path contains) clauses.
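To make the scoping concrete, here is a small local approximation of that logic. It is not Cloudflare's evaluator, just a sketch that mirrors the clauses with plain substring checks (contains is a case-sensitive substring match), using example.com and the bot and path lists from the generated expression later in this post. The function name is hypothetical.

```python
HOST = "example.com"
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
    "Claude-User", "PerplexityBot", "Applebot-Extended",
]
EXCLUDED_FRAGMENTS = ["/cart", "/checkout", "/account", "/wp-admin", "/admin", "/api"]

def rule_matches(host, user_agent, path):
    """Approximate whether a request would fall under the cache rule."""
    if host != HOST:
        return False
    # Human browsers carry none of the bot tokens, so they never match.
    if not any(bot in user_agent for bot in AI_BOTS):
        return False
    # Substring exclusion: any path containing a fragment is left alone.
    return not any(fragment in path for fragment in EXCLUDED_FRAGMENTS)
```

One side effect worth knowing: because the exclusions are substring matches, a path like /accounting-basics is also excluded by the "/account" fragment. If that matters for your URL structure, tighten the fragments (for example "/account/") in the tool.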

For AI bots, the change is dramatic. A cached HTML response served from Cloudflare's edge has Time to First Byte measured in tens of milliseconds, regardless of how slow your origin is. The 499 timeout disappears for cached pages. The page moves from "did not arrive in time, drop from candidate set" to "arrived in 80ms, included in candidate set."

This is the difference between being eligible for citation and being invisible.

What it does not do

The rule does not change which AI bots can crawl your site. That is governed by ai.txt, robots.txt, and your Cloudflare AI Crawl Control settings, all separate from cache behavior.

The rule does not enable Cloudflare's Markdown for Agents feature. That is a separate toggle in Caching > AI Crawl Control on Pro, Business, and Enterprise plans. The Cache Rule and Markdown for Agents are complementary; using both is the recommended pattern.

The rule does not magically fix slow origins. If your origin's 99th-percentile response time on long-tail URLs is 8 seconds, the cache rule helps everything that is in the cache, and the long tail still hits origin and still risks 499s. The cache rule plus origin profiling is the durable fix.
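For the origin profiling half, a rough way to find the long tail is to compute a per-path 99th percentile from origin timing samples. The sketch below uses a simple nearest-rank percentile; the (path, seconds) input shape, the function names, and the 2-second threshold are all assumptions for illustration, not numbers from the tool.

```python
import math
from collections import defaultdict

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def slow_paths(samples, threshold_seconds=2.0, pct=99):
    """Return {path: percentile_seconds} for paths above the threshold.

    samples is an iterable of (path, response_seconds) pairs, for example
    parsed from origin logs. The threshold is a placeholder to tune.
    """
    by_path = defaultdict(list)
    for path, seconds in samples:
        by_path[path].append(seconds)
    result = {}
    for path, times in by_path.items():
        value = percentile(times, pct)
        if value > threshold_seconds:
            result[path] = value
    return result
```

Paths that show up here are the ones still at risk of 499s whenever a bot request misses the cache.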

What is in the expression

A typical generated expression looks like this:

http.host eq "example.com" 
and (
  http.user_agent contains "GPTBot" or 
  http.user_agent contains "ChatGPT-User" or 
  http.user_agent contains "OAI-SearchBot" or 
  http.user_agent contains "ClaudeBot" or 
  http.user_agent contains "Claude-User" or 
  http.user_agent contains "PerplexityBot" or 
  http.user_agent contains "Applebot-Extended"
) 
and not (http.request.uri.path contains "/cart") 
and not (http.request.uri.path contains "/checkout") 
and not (http.request.uri.path contains "/account") 
and not (http.request.uri.path contains "/wp-admin") 
and not (http.request.uri.path contains "/admin") 
and not (http.request.uri.path contains "/api")

The bot list mirrors what shows up in real-world edge logs. The exclusion list is the standard set of dynamic paths to keep out of bot caching. Both are configurable in the tool.
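If you want to assemble the same shape of expression yourself, a minimal sketch looks like this. It is not the generator's actual code; the function name is hypothetical, and it only reproduces the structure shown above, joined onto a single line.

```python
def build_expression(host, bots, excluded_paths):
    """Assemble a Cloudflare rule expression string from its parts."""
    for value in [host, *bots, *excluded_paths]:
        if '"' in value or "\\" in value:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError(f"unsupported character in {value!r}")
    if not bots:
        raise ValueError("at least one bot user agent is required")
    bot_clause = " or ".join(f'http.user_agent contains "{bot}"' for bot in bots)
    parts = [f'http.host eq "{host}"', f"({bot_clause})"]
    parts += [
        f'not (http.request.uri.path contains "{path}")' for path in excluded_paths
    ]
    return " and ".join(parts)
```

Calling build_expression("example.com", ["GPTBot", "ClaudeBot"], ["/cart"]) produces a one-line expression with the host check, one parenthesized bot clause, and one exclusion.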

How to verify the fix worked

The Apply checklist tab includes the validation queries. Twenty-four hours after deploying the rule, segment your Cloudflare Analytics by user agent and confirm:

  • Cache HIT rate for AI bot user agents has increased
  • 499 rate for AI bot user agents has decreased
  • Origin request rate has decreased
  • No surprise increases in stale-content complaints from human users (this is the rule's blast-radius check)

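If the data lines up, you have eliminated one of the largest invisible reasons your site was missing from AI answers. If the cache HIT rate did not increase, the most likely causes are a rule whose Cache eligibility is not set to Eligible for cache (the Cache Rules counterpart of the Page Rules setting Cache Level: Cache Everything), or an Edge TTL mode that still respects origin headers, so a no-cache or private directive from the origin keeps the page out of the edge cache.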
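If you export request logs for the day before and the day after, the before and after comparison can be sketched as below. The row keys (user_agent, cache_status, status) are hypothetical names for whatever your export provides, and the function is illustrative rather than one of the tool's validation queries.

```python
def summarize_bot_traffic(rows, bot_tokens):
    """Compute cache HIT rate and 499 rate for AI bot requests.

    rows: iterable of dicts with assumed keys
          'user_agent' (str), 'cache_status' (str), 'status' (int).
    """
    bot_rows = [
        row for row in rows
        if any(token in row["user_agent"] for token in bot_tokens)
    ]
    total = len(bot_rows)
    if total == 0:
        return {"requests": 0, "hit_rate": 0.0, "rate_499": 0.0}
    hits = sum(1 for row in bot_rows if row["cache_status"].upper() == "HIT")
    closed = sum(1 for row in bot_rows if row["status"] == 499)
    return {
        "requests": total,
        "hit_rate": hits / total,
        "rate_499": closed / total,
    }
```

Run it once on each day's rows and compare: hit_rate should go up and rate_499 should go down if the rule is doing its job.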

Fact-check notes and sources

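Cache configuration affects production traffic. Apply on a staging hostname or behind a feature flag before promoting. Before declaring the rule deployed, verify with cache status logging: the CF-Cache-Status response header should report HIT on repeat bot requests, alongside the expected MISS, BYPASS, EXPIRED, and REVALIDATED values. A DYNAMIC value means the response was never treated as eligible for caching.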
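A quick spot check is to request a page with a bot-style user agent and read the cache status header. The sketch below uses only the standard library; the function names are hypothetical, the status descriptions are my paraphrase of Cloudflare's documented values, and a hand-set user agent may be treated differently from a verified bot by your security rules, so take the result as a hint rather than proof.

```python
import urllib.request

# Paraphrased meanings of common CF-Cache-Status values.
CACHE_STATUS_MEANING = {
    "HIT": "served from the edge cache",
    "MISS": "not in cache yet; fetched from origin",
    "EXPIRED": "cached copy was stale; refetched from origin",
    "REVALIDATED": "stale copy revalidated with origin and served",
    "BYPASS": "cache skipped, often due to origin headers or cookies",
    "DYNAMIC": "not considered eligible for caching",
}

def explain_cache_status(value):
    """Turn a CF-Cache-Status header value into a short explanation."""
    if value is None:
        return "no CF-Cache-Status header in the response"
    return CACHE_STATUS_MEANING.get(value.upper(), f"unrecognized status {value!r}")

def fetch_cache_status(url, user_agent, timeout=10.0):
    """Fetch url with the given User-Agent and return CF-Cache-Status (or None).

    Raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("CF-Cache-Status")
```

Call fetch_cache_status twice against the same staging URL: a MISS followed by a HIT is the pattern you want, while DYNAMIC on both suggests the rule is not matching.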



Last updated: April 2026