The AI Content Disclosure Audit is the audit you reach for when you already suspect a problem in this dimension and need a fast, copy-paste-able fix list. It reuses the same chrome as every other jwatte.com tool — deep-links from the mega analyzers, AI-prompt export, CSV/PDF/HTML download — but the checks it runs are narrow and specific.
Checks for AI-generated content disclosure: visible
Why this dimension matters
AI search runs in two stages: DISCOVERY (the LLM queries a classic search engine to get ~20 candidate URLs) and RETRIEVAL (it fetches those pages, chunks them into ~150-token passages, and cites whichever chunk best matches the query). Classic SEO buys the seat; paragraph-level structure buys the citation. AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) do NOT execute JavaScript — every critical claim must be in the server-rendered HTML.
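Because AI crawlers don't execute JavaScript, the quickest sanity check is whether a critical claim appears in the raw server response before any hydration. A minimal sketch — the function name and sample markup are hypothetical, not part of the tool:

```typescript
// Sketch: verify a critical claim survives in raw (pre-JS) HTML.
// AI crawlers read this string as-is; nothing is hydrated first.
function claimInServerHtml(html: string, claim: string): boolean {
  const text = html
    .replace(/<script[\s\S]*?<\/script>/gi, " ") // drop JS payloads
    .replace(/<[^>]+>/g, " ")                    // strip remaining tags
    .replace(/\s+/g, " ")
    .toLowerCase();
  return text.includes(claim.toLowerCase());
}

// A client-rendered shell fails even though the app would
// eventually show the claim after hydration.
const spaShell = `<div id="root"></div><script src="/app.js"></script>`;
const ssrPage = `<main><p>Our API returns JSON over HTTPS.</p></main>`;

console.log(claimInServerHtml(spaShell, "returns JSON")); // false
console.log(claimInServerHtml(ssrPage, "returns JSON"));  // true
```

Run the same check against your production URL's raw HTML (e.g. `curl` output) for each claim you expect to be citable.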
Common failure patterns
- SPA shell with empty `<div id="root">` — React / Vue / Angular apps that hydrate on the client look completely empty to AI crawlers. The fix is SSR (Next.js `getServerSideProps`, Nuxt `asyncData`, SvelteKit `load`) or prerendering / static export for content-heavy pages.
- Missing `llms.txt` at the site root — the emerging standard for pointing AI crawlers at your canonical content. Absence is not catastrophic, but presence makes your site noticeably easier to retrieve. Pair it with `llms-full.txt` for full-content mirroring.
- AI-crawler blocking in robots.txt without strategy — blocking GPTBot while allowing Googlebot is a choice; blocking all AI crawlers by default without knowing whether your audience queries ChatGPT / Claude / Perplexity is a cost. Decide deliberately; most content businesses benefit from allowing retrieval crawlers while blocking training crawlers.
- Paragraphs over 300 words — each `<p>` is a retrieval unit for the chunker. Target 40–150 words per paragraph. Thinner = no answer match; thicker = split mid-thought and lose coherence at citation time.
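The paragraph-length check above is mechanical enough to script. A rough sketch, assuming flat, non-nested `<p>…</p>` markup (the helper name is hypothetical):

```typescript
// Sketch: flag <p> blocks outside the 40–150-word retrieval sweet spot.
function auditParagraphs(html: string): { words: number; ok: boolean }[] {
  const paras = Array.from(html.matchAll(/<p[^>]*>([\s\S]*?)<\/p>/gi));
  return paras.map((m) => {
    const words = m[1]
      .replace(/<[^>]+>/g, " ") // strip inline tags inside the paragraph
      .trim()
      .split(/\s+/)
      .filter(Boolean).length;
    return { words, ok: words >= 40 && words <= 150 };
  });
}

// Too-short paragraph gets flagged:
console.log(auditParagraphs("<p>short</p>")); // [ { words: 1, ok: false } ]
```

Any `ok: false` entry is a candidate for merging (too thin) or splitting (too thick).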
How to fix it at the source
Start with llms.txt + llms-full.txt at the site root. Audit your robots.txt stance per bot deliberately. Restructure long paragraphs into 40–150-word chunks that each contain a complete claim + evidence pair. Track LLM referral visits via a custom Referrer segment (chatgpt.com, perplexity.ai, claude.ai, gemini.google.com, copilot.microsoft.com) — that is the canonical AEO KPI.
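A minimal sketch of the per-bot robots.txt stance described above — the bot selection and allow/block split are illustrative, not prescriptive, and the right mix depends on where your audience actually queries:

```text
# robots.txt — deliberate per-bot stance: allow retrieval, block training
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Disallow: /
```

Here GPTBot and Google-Extended (training-oriented crawlers) are blocked while ClaudeBot and PerplexityBot (retrieval-oriented) are allowed; invert any of these to match your own policy.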
When to run the audit
- After a major site change — redesign, CMS migration, DNS change, hosting platform swap.
- Quarterly as part of routine technical hygiene; the checks are cheap to run repeatedly.
- Before an investor / client review, a PCI scan, a SOC 2 audit, or an accessibility-compliance review.
- When a downstream metric drops (rankings, conversion, AI citations) and you need to rule out this dimension as the cause.
Reading the output
Every finding is severity-classified. The playbook is the same across tools:
- Critical / red: same-week fixes. These block the primary signal and cascade into downstream dimensions.
- Warning / amber: same-month fixes. Drag the score, usually don't block.
- Info / blue: context-only. The kind of thing a PR reviewer would flag but wouldn't block merge on.
- Pass / green: confirmation — keep the control in place.
Every audit also emits an "AI fix prompt" — paste it into ChatGPT / Claude / Gemini to get exact copy-paste code patches tied to your stack.
Related tools
- Mega AEO Analyzer — One URL, 10 AEO probes in one pass: schema, attribution, retrievability, freshness, accessibility, tokenizer, prompt-injection, AI-bot meta, speakable, E-E-A-T.
- AI Posture Audit — Cross-references robots.txt, ai.txt, meta robots, and X-Robots-Tag per AI bot — flags disagreements that cause unpredictable crawl behavior.
- llms.txt Quality Scorer — Fetches /llms.txt, /.well-known/llms.txt, /llms-full.txt.
- AI Crawler Access Auditor — Fetches robots.txt, ai.txt, llms.txt, meta robots, X-Robots-Tag.
- RAG Readiness Audit — 10-check score: SSR content, canonical, heading hierarchy, passage-friendly paragraphs, sentence-complete alt, schema type, freshness, script density, robots, clean canonical.
Fact-check notes and sources
- llmstxt.org: llms.txt proposed standard
- OpenAI: GPTBot documentation
- Anthropic: ClaudeBot documentation
- Perplexity: PerplexityBot
- Google: Google-Extended opt-out
This post is informational and not a substitute for professional consulting. Mentions of third-party platforms in the tool itself are nominative fair use. No affiliation is implied.