The audit tools on this site fetch a URL through a serverless proxy, then parse and score whatever HTML comes back. That works for most sites. It does not work in two specific situations, and the fix is different for each.
This post walks through both fixes at a complete-beginner level. No DevTools experience required.
Two reasons the auditor sees an empty or wrong page
Reason 1: bot protection blocks the probe. Cloudflare, AWS WAF, Akamai, Imperva, Sucuri, DataDome, Vercel Firewall, Wordfence, and similar layers serve a small JavaScript challenge to anything that does not look like a real Chrome session. The audit proxy is not a real Chrome session, so it gets the challenge body, not your page.
Reason 2: your site is a JavaScript-rendered shell. Popmenu, Pixieset, Squarespace, Wix, Webflow, Shopify Hydrogen, Next.js without SSR, Nuxt without SSR — all of these return a thin HTML wrapper that hydrates into the real page only after JavaScript runs in a browser. The audit proxy receives the wrapper, not the rendered page.
In both cases the analyzer banner tells you what happened. The fix path is different:
- For reason 1 (bot block): Path A (paste your own HTML) usually solves it. Path B (temporarily relax your CDN rule) is the alternative when you need a live fetch.
- For reason 2 (SPA shell): only Path A works. The CDN is not the problem; the missing content lives in JavaScript. You must capture the rendered DOM.
If you only want one of these and you know which one, jump to the relevant section. Otherwise read in order.
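To make the two failure modes concrete, here is a minimal sketch of how a fetched body could be sorted into "bot challenge", "JavaScript shell", or "real content". The marker strings and the 50-word threshold are assumptions chosen for illustration; they are not what the analyzers on this site use, and real challenge pages vary by vendor.

```python
import re

# Assumed, non-exhaustive markers; real challenge pages differ by vendor and version.
CHALLENGE_MARKERS = ("just a moment", "checking your browser", "captcha", "access denied")


def visible_word_count(html: str) -> int:
    """Count the words left after dropping script/style/noscript blocks and all tags."""
    text = re.sub(r"(?is)<(script|style|noscript)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    return len(text.split())


def classify_fetch(html: str, min_words: int = 50) -> str:
    """Return 'bot_challenge', 'js_shell', or 'content' for a fetched HTML body."""
    lowered = html.lower()
    words = visible_word_count(html)
    if words < min_words and any(marker in lowered for marker in CHALLENGE_MARKERS):
        return "bot_challenge"  # Reason 1: Path A or Path B
    if words < min_words and "<script" in lowered:
        return "js_shell"       # Reason 2: only Path A works
    return "content"            # enough visible text, or a tiny static page
```

A body with plenty of visible text is treated as content even if it happens to mention one of the marker words, which keeps an article about CAPTCHAs from being misread as a challenge page.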
A blocked audit is also a blocked AI retriever
If a tool on this site shows you a "blocked by WAF" banner, the same block is hitting GPTBot, ClaudeBot, OAI-SearchBot, ChatGPT-User, Claude-User, PerplexityBot, Perplexity-User, Applebot, Amazonbot, MistralAI-User, Meta's external agent, and large-scale dataset crawlers such as CCBot (Common Crawl) and Bytespider (ByteDance). They all fetch your page from a non-browser context the same way the audit proxy does. They get the same challenge body, read the same noindex,nofollow meta tag that challenge pages typically carry, and walk away.
That is the most important takeaway in this post: a blocked audit is a leading indicator that your site is invisible to AI search. Even if you do not care about the audit, the audit being blocked tells you something measurable about how AI engines see your site.
The companion post How to allowlist AI crawlers without weakening bot protection covers the long-term fix for that. The 499 problem covers the metric to watch in your logs. This post focuses on the short-term: getting an actual audit done today on a site you own.
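If you want to see the noindex,nofollow signal for yourself, a small parser over a saved response is enough. This is a sketch using only the Python standard library; the function name is made up for this example, and it only reads the robots meta tag, not the X-Robots-Tag response header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives
```

Run it on the body the proxy (or curl) received. If it returns noindex on a page you expect to be indexable, you are probably looking at a challenge response rather than your own markup.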
Path A: save your rendered HTML
This is the safer path. You open the page in your own browser, copy the rendered HTML, and paste it into a tool that scores HTML directly. No CDN settings change. Works for both bot-blocked sites and SPA shells.
The tool you paste into for the full audit: Mega Analyzer paste mode. It opens the paste panel automatically and scores your rendered HTML against every dimension — SEO, schema, E-E-A-T, voice, mobile parity, performance + AI, crawlability. Same checks the regular Mega Analyzer runs against a URL fetch.
For a quick rendered-vs-static diff (handy when you want to see specifically what JavaScript adds versus what the server sent), use Rendered DOM Paste Audit — that one runs ~10 fast checks plus the static-vs-rendered comparison if you also paste a URL.
Before you save: there are two flavors of "the HTML." View Source shows you the original HTML the server sent. Rendered DOM (what DevTools copies as "outer HTML") shows you the page after JavaScript has run. For a static site they are nearly identical. For a Squarespace, Wix, Popmenu, Pixieset, Shopify Hydrogen, or Next.js CSR site, View Source will look almost empty and the Rendered DOM is the one you want. When in doubt, use Rendered DOM. It almost always contains at least as much as View Source.
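One way to check which flavor you need is to compare the visible word count of both captures. The sketch below assumes you have saved View Source and the Rendered DOM as two strings; the class and function names are hypothetical and this is not the comparison the site's tools run.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects words that sit outside script, style, noscript, and template."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.words: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def word_count(html: str) -> int:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(parser.words)


def rendered_gain(view_source: str, rendered_dom: str) -> dict:
    """Compare visible words in the server HTML against the rendered DOM."""
    static, rendered = word_count(view_source), word_count(rendered_dom)
    return {"static_words": static, "rendered_words": rendered, "added_by_js": rendered - static}
```

A large added_by_js value with a near-zero static_words value is the signature of a JavaScript-rendered shell.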
Chrome, Edge, Brave, Vivaldi, Arc, Opera (Chromium browsers)
- Open the page you want to audit in a normal browser tab. Wait for it to fully load — give Wix / Squarespace / a Pixieset gallery / a Shopify product grid an extra two or three seconds to render.
- Right-click anywhere on the page and choose Inspect from the context menu. The DevTools panel opens at the bottom or the right side of the window.
- In the Elements tab (the leftmost tab in DevTools), find the very first line that says <html …>. It will be at the top.
- Right-click that <html> line. Pick Copy → Copy outerHTML from the submenu.
- Open Rendered DOM Paste Audit, paste into the textarea, and run.
If you only want View Source instead, the keyboard shortcut is Ctrl + U on Windows / Linux, Cmd + Option + U on Mac. Then Ctrl + A (or Cmd + A), Ctrl + C (or Cmd + C), paste.
Firefox
- Open the page. Wait for it to load.
- Right-click → Inspect (or press Ctrl + Shift + C / Cmd + Option + C). The Inspector tab opens.
- In the Inspector pane, find the <html> tag near the top.
- Right-click that line → Copy → Outer HTML.
- Paste into Rendered DOM Paste Audit and run.
For View Source: Ctrl + U / Cmd + U. Same select-all-and-copy as above.
Safari (Mac)
- Open Safari, then Safari → Settings → Advanced and tick Show features for web developers. (One-time setup.)
- Open the page. Wait for it to load.
- Right-click → Inspect Element (or press Cmd + Option + I). The Web Inspector opens.
- In the Elements tab, find the <html> line.
- Right-click → Copy → Outer HTML.
- Paste into the tool and run.
iPhone / iPad (Safari on iOS)
iOS Safari does not let you copy outer HTML directly. Two workarounds:
- Use the Share sheet → Markup or simply screenshot the page for visual review. This does NOT work for the audit tool — you need text HTML.
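- The proper path: on the iPhone, open the Safari settings, go to Advanced, and turn on Web Inspector. Connect the iPhone to a Mac with a cable and trust the computer when prompted. Open Safari on the Mac (with the developer features from the Safari section above enabled), then Develop → [your iPhone] → [the open tab]. The Mac Web Inspector now controls the iPhone tab and you can Copy → Outer HTML the same way as on Mac.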
If neither is practical, audit the same page from a desktop browser instead. The DOM rarely differs in a way that affects audit results.
Android (Chrome on phone)
- Plug the phone into a Windows/Mac/Linux computer with a USB cable.
- On the phone: Settings → Developer Options → enable USB debugging. (You may need to first tap the build number 7 times in About Phone to enable Developer Options.)
- On the desktop, open Chrome and visit chrome://inspect. Under Devices you should see your phone's open Chrome tab. Click Inspect.
- The desktop now controls the phone's tab. Use Elements → right-click <html> → Copy → Copy outerHTML.
Same caveat as Safari: if remote inspect is overkill, audit the desktop version of the page instead.
Command line (no GUI)
If you prefer the terminal and the page is NOT a JavaScript-rendered shell:
curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36" \
-H "Accept: text/html,application/xhtml+xml" \
-L --max-time 15 \
"https://yoursite.com/" > page.html
Open page.html in a browser to confirm it is the real content rather than a challenge response, then paste into the tool.
If the page IS a JavaScript-rendered shell, curl will not capture the rendered DOM — JavaScript does not run in curl. Use the browser DevTools route above.
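If you would rather script the fetch than type the curl command, the sketch below mirrors it with Python's standard library. The function names are made up for this example. One difference to be aware of: the timeout here applies to each blocking network operation, while curl's --max-time caps the whole transfer.

```python
import urllib.request

# Same browser-like User-Agent as the curl example above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36"
)


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request carrying the same headers as the curl command."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": BROWSER_UA, "Accept": "text/html,application/xhtml+xml"},
    )


def save_page(url: str, path: str = "page.html", timeout: float = 15.0) -> int:
    """Fetch url (redirects are followed by default) and write the raw body to path."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        body = response.read()
    with open(path, "wb") as handle:
        handle.write(body)
    return len(body)  # number of bytes written
```

Calling save_page("https://yoursite.com/") writes page.html next to the script. The same caveat applies as with curl: no JavaScript runs, so this is only useful for pages that are not JavaScript-rendered shells.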
What if my platform is one of these?
A short cheat sheet of common platforms and what to expect:
- Popmenu (restaurant menus): thin shell, must use Rendered DOM Paste route. Menu items, prices, and hours render after JavaScript runs.
- Pixieset (photographer galleries): thin shell, must use Rendered DOM Paste route. Gallery thumbnails and collection labels render client-side.
- Squarespace: most blocks render client-side. Use Rendered DOM Paste.
- Wix: pages render from JSON via client-side React. Use Rendered DOM Paste.
- Webflow CMS: collection pages defer their list content to client-side hydration. Use Rendered DOM Paste.
- Shopify Hydrogen / Storefront API themes: product grids hydrate client-side. Use Rendered DOM Paste. Classic Liquid themes are fine without the paste workflow.
- Next.js with getStaticProps or getServerSideProps: full HTML in the server response, so the regular audit works fine.
- Next.js with useEffect data-fetching: thin shell, use Rendered DOM Paste.
- WordPress (most themes): full HTML in the server response, regular audit works. Exceptions: WP REST-API-driven pages built with React/Vue widgets.
- Notion / Super.so / Framer: thin shell, use Rendered DOM Paste.
Path B: briefly unblock the probe
This is the path when you need a live fetch — for example, the audit checks /sitemap.xml or /robots.txt separately from the page body, and pasted HTML alone does not solve those checks. Or you run a content-batch audit that fetches multiple URLs.
Discipline: scope the change as narrowly as you can, run the audit, then revert. Do not leave bot protection off after the audit is done. The same rules that block the probe also block credential stuffing, scraping, and password spraying.
Also: not every block is in a CDN. WordPress sites often have Wordfence inside the application itself. Some hosts run their own bot defense at the platform layer. The list below covers the ten most common places the rule lives, in order of who you are most likely to be using.
1. Cloudflare
The simplest move: Security → Bots → Configure, toggle Bot Fight Mode to off. Run the audit. Toggle back on.
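The surgical move: Security → WAF → Custom Rules → Create rule. Expression: (ip.src eq YOUR.IP.ADDRESS.HERE). Action: Skip. Check the boxes for "All remaining custom rules" and "All Super Bot Fight Mode Rules". Save. Note that, per Cloudflare's documentation, the free-plan Bot Fight Mode cannot be skipped by a custom rule; if that is what blocks you, use the toggle above instead. Find your IP at whatismyip.com. One caveat: the audit runs from a Netlify Function, so a rule scoped to your own IP does not unblock the audit proxy itself; for that you would need the Netlify outbound IP range. For one-off auditing, allowing your own IP, fetching the page yourself, and pasting it as in Path A above is the easier path.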
After auditing, return to the rule, Disable or Delete.
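A typo in the IP address is the most common reason a skip rule silently matches nothing. As a small sketch, the helper below validates the address before producing the (ip.src eq …) expression from the step above; the function name is hypothetical, and the output is just a string for you to paste into the rule editor.

```python
import ipaddress


def skip_rule_expression(ip: str) -> str:
    """Return a Cloudflare rule expression matching exactly one source IP.

    Raises ValueError if the input is not a valid IPv4 or IPv6 address.
    """
    address = ipaddress.ip_address(ip.strip())
    return f"(ip.src eq {address})"
```

For example, skip_rule_expression("203.0.113.7") gives (ip.src eq 203.0.113.7), where 203.0.113.7 is a documentation-range placeholder, not a real audit IP.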
The Cloudflare docs are at WAF Custom Rules → Skip.
For a permanent fix that allowlists verified AI bots without weakening protection, see the AI crawler allowlist guide.
2. Vercel Firewall
Two places to check:
- Project → Settings → Security → Attack Challenge Mode: toggle off, run, toggle on.
- Project → Settings → Security → Firewall: any rule with the Challenge action — temporarily change to Skip for your IP, or disable.
Vercel docs: Attack Challenge Mode.
3. Netlify
Netlify itself does not run a bot-management layer at the edge. If your Netlify site is challenging requests, the rule is somewhere else: Cloudflare or Akamai in front of Netlify, an Edge Function you wrote, or a serverless function with a _guard.mjs rate-limit pattern (this site uses one).
Check Site Settings → Build & Deploy → Edge Functions for installed challenge logic. Look at any netlify.toml [[redirects]] blocks that send / to a challenge page. Look at the function code itself if a function appears to be challenging requests.
4. AWS WAF / CloudFront
Find your Web ACL in the AWS WAF console. Locate any rule with an Action of Challenge or CAPTCHA. Two options:
- Disable the rule temporarily. Easy and reversible.
- Override the rule's action (for example, to Count), or add a higher-priority Allow rule that matches an IP set containing your audit IP. Remove the override or the Allow rule when the audit is done.
AWS docs: Action overrides.
5. Akamai Bot Manager
The per-IP allow rule is at Bot Manager → Custom Bot Categories → Allow List. Add your audit IP. Run. Remove.
For per-bot allowlisting (the long-term fix), see Configure allowlists.
6. Imperva (formerly Incapsula)
Account Settings → Security → Allow List. Add the IP. Run. Remove.
Docs: Imperva access control.
7. Sucuri WAF (often bundled with GoDaddy Managed WordPress)
If you bought hosting at GoDaddy with the Web Security upgrade, Sucuri sits in front of your site even if you have never opened the Sucuri dashboard.
- The GoDaddy panel: dashboard → Web Security → Managed WAF → Settings, toggle Block Aggressive Web Crawlers off temporarily.
- The Sucuri direct panel: log in at waf.sucuri.net → Settings → Access Control → IP Whitelisting. Add your audit IP. Run. Remove.
Docs: Sucuri WAF — Access Control.
8. DataDome
Settings → Web Protection → Exceptions. Add your IP. Action: Allow. Run. Remove.
Docs: DataDome customizations.
9. WordPress + Wordfence (or other security plugins)
Wordfence runs inside WordPress itself, after the request reaches your server. Even if there is no CDN in front, Wordfence can challenge the audit.
- Wordfence → Firewall → All Firewall Options: scroll to Whitelisted IP addresses that bypass all rules and add your audit IP. Save. Run. Remove.
- The blanket toggle is at the same panel: Firewall Status → Disabled. Avoid leaving this off: every minute Wordfence is disabled is a minute brute-force attempts hit your /wp-login.php directly.
Other WordPress-layer firewalls follow the same shape: WP Cerber Security, MalCare, iThemes Security, NinjaFirewall — each has an IP allowlist or temporary-disable toggle in their Settings page.
10. Fastly Bot Management
Web Application Firewall → Rules. Find any rule with the action Block or Challenge that matches your auditor's traffic. Either disable the rule or add an exception with your IP. Save. Run. Restore.
Docs: Fastly bot management.
What about the host I am using that is not on this list?
The pattern is the same across every CDN and bot-protection vendor:
- Find the panel labeled "Bot Manager", "WAF", "Firewall", "Security", or "Access Control".
- Look for "Allow List", "Whitelist", "Skip Rule", "Custom Rule with Skip action", or "Disable Rule".
- Add your IP, OR a user-agent rule, OR temporarily disable the entire rule group.
- Run the audit.
- Reverse the change.
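If you script any of this against a vendor API, the change, audit, reverse pattern above maps naturally onto a context manager, so the reversal runs even when the audit step fails. This is a minimal sketch: relax and restore are hypothetical callables you would write for your own vendor, and nothing here calls a real CDN API.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_relaxation(
    relax: Callable[[], None], restore: Callable[[], None]
) -> Iterator[None]:
    """Apply a temporary rule change, then always restore the original posture."""
    relax()
    try:
        yield
    finally:
        # Runs on success and on error, so protection is never left off by accident.
        restore()
```

Usage looks like `with temporary_relaxation(disable_rule, enable_rule): run_audit()`, where all three names stand in for your own functions.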
If you cannot find the right panel, paste the prompt at the bottom of this post into Claude or ChatGPT with your specific vendor name. The assistant should produce the menu path for your dashboard.
After you make a change, verify and revert
The point of the audit is to verify the change, not to assume it. After you allowlist or skip a rule, re-run the audit with no other changes and confirm:
- The WAF banner is no longer shown (the analyzer banner reads green)
- The score updates to reflect actual content (word count above zero, schema types listed, headings populated)
- The AI Crawler Access Auditor no longer flags any bot as "WAF challenge"
Only then is the change verified. If any of those still fail, the rule scope was wrong and you need to widen it.
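The content signals in that checklist (word count above zero, schema types listed, headings populated) can be approximated on a saved response with a short script. This sketch is an assumption about what "real content" looks like, not the scoring the analyzers use: it counts heading tags without checking they contain text, it reads only top-level JSON-LD @type strings (no @graph), and a genuine page with no schema at all would fail the combined check.

```python
import json
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class AuditSignals(HTMLParser):
    """Collects visible word count, heading count, and JSON-LD @type values."""

    def __init__(self) -> None:
        super().__init__()
        self.words = 0
        self.headings = 0
        self.schema_types: list[str] = []
        self._raw_tag = None  # "script" or "style" while inside one
        self._jsonld = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            script_type = (dict(attrs).get("type") or "").lower()
            self._raw_tag = tag
            self._jsonld = tag == "script" and script_type == "application/ld+json"
            self._buffer = []
        elif tag in HEADING_TAGS:
            self.headings += 1

    def handle_endtag(self, tag):
        if tag == self._raw_tag:
            if self._jsonld:
                self._record_types("".join(self._buffer))
            self._raw_tag, self._jsonld = None, False

    def handle_data(self, data):
        if self._raw_tag is None:
            self.words += len(data.split())
        elif self._jsonld:
            self._buffer.append(data)

    def _record_types(self, raw: str) -> None:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            return  # malformed JSON-LD is ignored in this sketch
        for item in doc if isinstance(doc, list) else [doc]:
            if isinstance(item, dict) and isinstance(item.get("@type"), str):
                self.schema_types.append(item["@type"])


def audit_signals(html: str) -> dict:
    """Summarize the three content signals for one HTML document."""
    parser = AuditSignals()
    parser.feed(html)
    parser.close()
    looks_real = parser.words > 0 and parser.headings > 0 and bool(parser.schema_types)
    return {"words": parser.words, "headings": parser.headings,
            "schema_types": parser.schema_types, "looks_real": looks_real}
```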
Then revert your change. Set a calendar reminder for 15 minutes after you flip the toggle. The same rules that blocked the audit also block credential-stuffing, content scraping, and AI training scrapers you might not want.
Why "save my own HTML" is sometimes better than relaxing the rule
A surprising number of audits go faster when you skip the live-fetch path entirely:
- You audit what humans actually see. A logged-in dashboard, a paywalled article, an A/B test variant. The auditor proxy will never get to those. Your browser already does. Save the page after you have the exact state you want scored.
- You can audit a draft before publish. Paste the HTML from the WordPress preview pane, the Eleventy _site/ build, or the Next.js dev-server output. No need to push to production first.
- You avoid scoring a challenge page by accident. If you forgot to verify the proxy got real content, you can score a 100% perfect challenge response. Meaningless. Pasting your real HTML removes that ambiguity.
- It costs nothing. No CDN settings, no risk window, no rollback to forget.
The trade-off: tools that fetch sibling resources (/sitemap.xml, /robots.txt, JSON-LD blocks loaded via fetch, image dimensions from byte size) cannot do that when you only paste the page HTML. For those checks, the live-fetch path is required, which is when the temporary CDN relax becomes the right call.
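To see why pasted HTML cannot cover those checks, note that the sibling resources live at the origin, not inside the page. A sketch of how a tool might derive them from the page URL (the function name is hypothetical):

```python
from urllib.parse import urlsplit, urlunsplit


def sibling_urls(page_url: str) -> dict[str, str]:
    """Derive the origin-level /robots.txt and /sitemap.xml URLs for a page."""
    parts = urlsplit(page_url)
    origin = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    return {"robots": origin + "/robots.txt", "sitemap": origin + "/sitemap.xml"}
```

Each of those URLs needs its own live request, which is exactly what a bot block prevents and what a paste cannot supply.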
Reference prompt — paste this into your AI assistant
If your specific setup is not in this post, paste this prompt into Claude, ChatGPT, or Gemini. Replace the bracketed values.
You are a senior web-infrastructure engineer. I need to either (a) save my page's
rendered HTML from my browser, or (b) temporarily relax my CDN's bot protection
so a remote audit tool can fetch my page. After the audit, I want to fully restore
the original protection.
My setup:
- Domain: [example.com]
- DNS / CDN provider: [Cloudflare / Vercel / Netlify / GoDaddy / AWS / Akamai /
Imperva / DataDome / Fastly / Wordfence / other]
- Hosting platform: [Netlify / Vercel / WordPress / Shopify / Squarespace /
Wix / Webflow / custom Node / static / other]
- Site framework (if known): [Next.js / Nuxt / SvelteKit / Astro / Eleventy /
raw HTML / Hydrogen / other]
- Browser I have available: [Chrome / Firefox / Safari / Edge]
- Page I want audited: [https://example.com/specific-page]
What I want:
1. If my site is a JavaScript-rendered SPA, the exact DevTools steps to copy the
rendered DOM in my browser.
2. If my site is bot-challenged, the exact menu path in my CDN dashboard to either
(a) allow my audit IP, or (b) temporarily disable the bot challenge.
3. A reversal checklist — how to confirm protection is back to the original
posture after I am done.
4. Anything I should NOT touch (settings that look related but are unrelated and
would weaken protection).
5. If a UI walkthrough is impractical, give me the exact API call (with curl
example) that does the same thing.
Treat this as a working-hours change. I want to audit, verify, and restore within
15 minutes.
The assistant should produce a clean step-by-step that matches your dashboard. If the answer is generic, paste a screenshot of your CDN console and ask again; the second turn typically lands on the right setting once it can see the actual UI.
Per-bot probes are a different story
If a per-bot tool (AI Crawler Access Auditor, AI Bot Allowlist Validator) reports "ClaudeBot challenged", that is a real production finding, not a tool failure. The whole point of those tools is to detect bot challenges. Do NOT relax the CDN rule before running them. The fix is to allowlist verified bots in the CDN (guide), not to disable the rule for the audit.
Even when the auditor works, recheck your blocking posture
A clean audit run does not mean your site is fully reachable to AI agents. The audit proxy is one fetcher with one user agent, hitting from one IP range. AI retrievers run dozens of fetchers from different IP ranges with different user agents. Edge cases that escape this proxy can still trip up GPTBot, and vice versa.
Run the AI Crawler Access Auditor and the AI Bot Allowlist Validator on your site quarterly, not just when something breaks. They probe each AI bot's published user agent against your origin separately and tell you which specific bot UAs are getting challenged. That is a different and more diagnostic test than the single-fetch path used by the content auditors.
If anything in those probes says "WAF challenge" for a verified retrieval bot — GPTBot, ClaudeBot, ChatGPT-User, Claude-User, PerplexityBot, Perplexity-User, OAI-SearchBot, Applebot, Bingbot — it is a citation-loss signal, not a security signal. Fix it.
Related reading
- Cloudflare is blocking AI crawlers from your site — long-form CDN allowlist walkthrough
- The 499 problem: Cloudflare's signal that your AI eligibility is collapsing — how WAF and TTFB combine to lock you out of AI search
- Cloudflare Agent Readiness Score — the agent-side signals AI scorers look for
- Markdown for Agents: serving your pages twice — a different fix for the same problem (serve a markdown alternate for bot UAs)
If you are running a sub-$100 SMB stack and need the agent-allowlist + content-paste workflow without buying a third-party scanner, the under-$100 toolchain in The $97 Launch builds the same workflow in-house with the tools on this site.
Fact-check notes and sources
- Cloudflare WAF Custom Rules — Skip action: https://developers.cloudflare.com/waf/custom-rules/skip/
- Cloudflare verified bots: https://developers.cloudflare.com/bots/concepts/bot/#verified-bots
- Vercel Attack Challenge Mode: https://vercel.com/docs/security/attack-challenge-mode
- AWS WAF action overrides: https://docs.aws.amazon.com/waf/latest/developerguide/web-acl-rule-group-override-options.html
- Sucuri WAF access control: https://docs.sucuri.net/website-firewall/settings/access-control/
- Akamai Bot Manager allow lists: https://techdocs.akamai.com/bot-manager/docs/configure-allowlists
- Imperva access control: https://docs.imperva.com/bundle/cloud-application-security/page/access-control.htm
- DataDome customizations: https://docs.datadome.co/docs/customizations
- Fastly Web Application Firewall: https://docs.fastly.com/products/web-application-firewall
- Wordfence Firewall whitelisting: https://www.wordfence.com/help/firewall/options/
- Chrome DevTools — Copy outerHTML: https://developer.chrome.com/docs/devtools/dom/
- Firefox DevTools Inspector: https://firefox-source-docs.mozilla.org/devtools-user/page_inspector/
- Safari Web Inspector: https://webkit.org/web-inspector/
This post is informational, not security or hosting advice. Mentions of Cloudflare, Vercel, Netlify, GoDaddy, AWS, Akamai, Imperva, Sucuri, DataDome, Fastly, Wordfence, Popmenu, Pixieset, Squarespace, Wix, Webflow, Shopify, Next.js, and similar products are nominative fair use. No affiliation is implied.