The LLMs.txt Validator tells you your file is broken. The LLMs.txt Generator gives you a correct one, built automatically from your sitemap, titles, and meta descriptions.
What the generator does
- Fetches your sitemap.xml via the serverless proxy.
- Caps to the first N URLs (default 80; configurable up to 150 for speed).
- Fetches each URL, extracts <title> and <meta name="description">.
- Groups by path-first-segment (/docs/* → "Docs", /blog/* → "Blog", / → "Core").
- Emits a spec-compliant /llms.txt with H1 (your site name), blockquote description, H2 per group, and markdown link rows with title + description.
- If a live /llms.txt exists at your domain, shows a diff: URLs to add, URLs to remove.
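The steps above can be sketched in a few dozen lines of Python. This is a minimal illustration, not the generator's actual code: the function names (`sitemap_urls`, `extract_head`, `group_label`, `render_llms_txt`) are hypothetical, fetching through the proxy is left out, and the label for a segment is assumed to be its capitalized form.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urlparse

# Namespace used by <urlset> documents in the sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text, limit=80):
    """Return the first `limit` <loc> values from a sitemap document."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return locs[:limit]


class HeadParser(HTMLParser):
    """Collects <title> text and <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_head(html):
    """Return (title, description) for one fetched page."""
    parser = HeadParser()
    parser.feed(html)
    return parser.title.strip(), parser.description


def group_label(url):
    """Map /docs/* to "Docs", /blog/* to "Blog", and / to "Core"."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0].capitalize() if segments else "Core"


def render_llms_txt(site_name, summary, pages):
    """Build the file from (url, title, description) tuples."""
    groups = {}
    for url, title, description in pages:
        groups.setdefault(group_label(url), []).append((url, title, description))
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for label, rows in groups.items():
        lines += [f"## {label}", ""]
        for url, title, description in rows:
            suffix = f": {description}" if description else ""
            lines.append(f"- [{title or url}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Groups appear in the order their first URL is seen in the sitemap, so a sitemap that lists the home page first puts "Core" at the top of the file.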
Why auto-generation beats hand-writing
Hand-written /llms.txt files drift. You publish new content, forget to update the file, and retrievers see yesterday's catalog. Auto-generation from the sitemap means the file stays in sync with what you're actually publishing — as long as you rerun the generator periodically (monthly is plenty for most sites).
For sites with a build pipeline (Eleventy, Next.js, Astro), the pattern is: run the generator, save the output into your source tree (src/llms.txt.njk in an Eleventy project, for example), commit, and ship on the next deploy. If the site updates daily, automate this as a CI step.
The diff mode
If /llms.txt already exists at your domain, the diff tab shows:
- URLs to add. In sitemap, not in current file. New content you haven't published to llms.txt yet.
- URLs to remove. In current file, not in sitemap. Deleted or deprecated content still referenced.
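The two lists amount to a pair of set differences. The sketch below assumes the live file uses standard markdown links and compares URLs as exact strings; `linked_urls` and `diff_urls` are hypothetical names, and the real tool may normalize URLs (trailing slashes, for instance) before comparing.

```python
import re

# Captures the target of a markdown link such as [Title](https://example.com/page).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def linked_urls(llms_txt):
    """Return the set of URLs linked from an existing llms.txt."""
    return set(LINK_RE.findall(llms_txt))


def diff_urls(sitemap, llms_txt):
    """Compare sitemap URLs against the URLs already listed in llms.txt."""
    current = linked_urls(llms_txt)
    wanted = set(sitemap)
    return {
        "add": sorted(wanted - current),     # in sitemap, not in current file
        "remove": sorted(current - wanted),  # in current file, not in sitemap
    }
```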
Copy the generated file over the live one and push. The validator should now pass 12/12 checks.
Grouping choice
The default "group by first path segment" works for most sites. If your IA doesn't match URL paths (e.g. /posts/foo and /posts/bar are actually in different conceptual categories), you'll want to edit the output H2 section names manually before shipping.
Future improvement: read the Open Graph article:section property or a <meta name="category"> tag if present, to let authors self-categorize without relying on path. Not yet implemented; add it as a template override if you need it.
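Since this is not implemented, the following is only one way such an override might look. It checks each meta tag for a `property` or `name` of `og:section`, `article:section`, or `category` and falls back to the path-based label otherwise. `CategoryParser` and `page_category` are hypothetical names, and which of those tags a given site emits is an assumption.

```python
from html.parser import HTMLParser

# Meta keys treated as an author-declared category (an assumption, not a spec).
CATEGORY_KEYS = ("og:section", "article:section", "category")


class CategoryParser(HTMLParser):
    """Remembers the first non-empty category-like <meta> tag on a page."""

    def __init__(self):
        super().__init__()
        self.category = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.category:
            return
        attrs = dict(attrs)
        key = (attrs.get("property") or attrs.get("name") or "").lower()
        if key in CATEGORY_KEYS:
            self.category = (attrs.get("content") or "").strip() or None


def page_category(html, fallback):
    """Return the declared category, or `fallback` (the path-based label)."""
    parser = CategoryParser()
    parser.feed(html)
    return parser.category or fallback
```

Called as `page_category(html, "Posts")`, it keeps the path-derived "Posts" heading for any page that declares nothing.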
Related reading
- LLMs.txt Validator — 12 structural checks on the output
- AI Posture Audit — broader discovery-surface audit
- ai.txt Generator — companion training-policy file
- llms.txt structural spec — format reference
Fact-check notes and sources
- llmstxt.org specification: llmstxt.org
- Sitemap protocol: sitemaps.org/protocol.html
- RFC 8615 (Well-Known URIs): datatracker.ietf.org/doc/html/rfc8615
The $100 Network covers llms.txt as a site-network deliverable — one template, per-site fills. The generator is the template; the validator is the gate.