The llms.txt convention (proposed by Jeremy Howard at llmstxt.org) is commonly published as two files:
- llms.txt — short-form discovery file. Points to the most important URLs. A map.
- llms-full.txt — long-form content map. Includes full content summaries, inline narrative, the actual substance.
Most sites that adopt the convention publish only llms.txt. That's a map with no terrain. The LLM that fetches it gets a table of contents; if it wants substance, it has to crawl the linked pages individually.
llms-full.txt is where actual retrieval happens. A well-written llms-full.txt is a 20-100KB text file containing summaries of every major content area, editorial voice, business description, author context — everything an LLM needs to understand your site without a full crawl.
Sites that publish both files give retrievers the map and the substance, and they appear to be the ones getting preferred treatment in LLM-sourced answers.
What the llms-full.txt Coverage Audit does
Paste a domain. The tool:
- Fetches /llms.txt, /llms-full.txt, and /.well-known/llms.txt.
- Checks presence + size of each.
- Validates llms-full.txt structure: H1, H2 sections, content summaries, outbound links.
- Cross-references llms.txt URLs against llms-full.txt content.
- Emits a starter scaffold if llms-full.txt is missing.
- Emits an AI fix prompt that recommends edits based on actual findings.
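The fetch, validate, and cross-reference steps in the list above can be sketched in Python. This is a minimal sketch under stated assumptions, not the tool's actual implementation: the function names (`fetch_text`, `validate_structure`, `cross_reference`), the link regex, and the choice to count lines starting with `> ` as content summaries are all hypothetical.

```python
import re
import urllib.error
import urllib.request
from typing import Optional

# The three paths named in the list above.
CANDIDATE_PATHS = ("/llms.txt", "/llms-full.txt", "/.well-known/llms.txt")

# Assumption: links are written as markdown [text](https://...) pairs.
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def fetch_text(domain: str, path: str, timeout: float = 10.0) -> Optional[str]:
    """Return the body at https://<domain><path>, or None if it cannot be fetched."""
    try:
        with urllib.request.urlopen(f"https://{domain}{path}", timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, TimeoutError):
        return None


def validate_structure(text: str) -> dict:
    """Count the structural elements the audit looks for."""
    lines = text.splitlines()
    h1 = [ln for ln in lines if ln.startswith("# ")]
    h2 = [ln[3:].strip() for ln in lines if ln.startswith("## ")]
    summaries = [ln for ln in lines if ln.startswith("> ")]
    return {
        "has_h1": len(h1) == 1,
        "h2_sections": h2,
        "summary_blocks": len(summaries),
        "outbound_links": len(LINK_RE.findall(text)),
    }


def cross_reference(llms_txt: str, llms_full_txt: str) -> list:
    """Return URLs linked from llms.txt that are not linked from llms-full.txt."""
    full_urls = {url for _, url in LINK_RE.findall(llms_full_txt)}
    missing = []
    for _, url in LINK_RE.findall(llms_txt):
        if url not in full_urls and url not in missing:
            missing.append(url)
    return missing
```

A caller would loop over `CANDIDATE_PATHS`, call `fetch_text` for each, and run the two pure functions on whatever comes back. Comparing extracted URL sets rather than raw substrings avoids treating `/guide-a/` as covered just because `/guide-a/extra/` appears in the long file.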
The structure of a strong llms-full.txt
# [Site Name]
> One-sentence description of the site's purpose and audience.
## About
> 50-100 word paragraph describing the business / publication / resource. Include founding year, scope, editorial slant.
## Key Content
- [Guide to Topic A](https://yoursite.com/guide-a/) — One-sentence summary of the page's value proposition.
- [Service Overview](https://yoursite.com/services/) — Pricing model, service area, expectations.
## Primary Services
### Service Name
> 3-5 sentence description: problem solved, who it's for, what's included, timeline, pricing.
## Case Studies
- [Case 1](...) — Outcome summary.
## Author / Publisher
> Who runs the site. Credentials, experience, external profiles.
## Updated
Last updated: YYYY-MM-DD
Size: 20-100KB. If yours is under 3KB, it's a stub. Over 200KB, it's probably too verbose — trim.
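The size guidance above maps onto a small classifier. This is a sketch only: the text names the 3KB, 20-100KB, and 200KB thresholds but says nothing about the 3-20KB and 100-200KB bands, so the "thin" and "long" labels are assumptions, as are the function name and the choice of 1KB = 1024 bytes.

```python
def classify_size(n_bytes: int) -> str:
    """Bucket an llms-full.txt file by size in bytes."""
    size_kb = n_bytes / 1024  # assumption: 1KB = 1024 bytes
    if size_kb < 3:
        return "stub"          # under 3KB
    if size_kb < 20:
        return "thin"          # assumed label: above stub, below target
    if size_kb <= 100:
        return "target"        # the 20-100KB range
    if size_kb <= 200:
        return "long"          # assumed label: above target, not yet over 200KB
    return "too verbose"       # over 200KB, trim


# Usage: measure the encoded byte length, not the character count.
body = "# Example Site\n> One-sentence description.\n"
print(classify_size(len(body.encode("utf-8"))))
```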
Why this specifically helps LLM retrieval
When an LLM (or an agent using the LLM) decides whether to retrieve from your site, it may fetch /llms-full.txt first to understand the site's scope. What it finds in that file then informs the decision to dig deeper.
A thin llms.txt-only site looks like "this site exists, has some pages" — the LLM might or might not invest retrieval budget there.
A comprehensive llms-full.txt site looks like "this site's domain is well-mapped, here's exactly what it covers and where to look for what" — the LLM retrieves confidently.
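The delta shows up in citation rates: in the community benchmarks and AEO-monitoring studies listed under the sources, sites that published llms-full.txt 3-6 months before an observation cycle typically saw 2-3x the AI-citation rate of comparable sites publishing only llms.txt. These figures are observational.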
The maintenance cadence
- New content shipped: add an entry to llms-full.txt in the relevant section + link to it.
- Monthly: audit + refresh any section that references specific prices, service areas, or dates that might have changed.
- Annually: comprehensive revision. Rewrite intro to reflect current positioning. Update author bios. Prune obsolete entries.
A site that hasn't touched llms-full.txt in 18 months looks stale to retrievers. Freshness discipline applies here too.
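The 18-month staleness rule can be checked mechanically if the file carries the `Last updated: YYYY-MM-DD` line from the template. The sketch below assumes that exact line format. The function names are hypothetical, and treating a missing or unparseable date as stale is a design choice of this example, not something the convention requires.

```python
import re
from datetime import date
from typing import Optional

# Assumption: the date sits on its own line, as in the template's Updated section.
UPDATED_RE = re.compile(r"^Last updated:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def months_since_update(text: str, today: date) -> Optional[int]:
    """Whole months between the file's 'Last updated' date and today, or None."""
    match = UPDATED_RE.search(text)
    if match is None:
        return None
    try:
        updated = date.fromisoformat(match.group(1))
    except ValueError:
        return None  # matched the pattern but is not a real calendar date
    months = (today.year - updated.year) * 12 + (today.month - updated.month)
    if today.day < updated.day:
        months -= 1  # the current month has not fully elapsed
    return months


def is_stale(text: str, today: date, max_months: int = 18) -> bool:
    """True when the file is undated or was last updated max_months or more ago."""
    months = months_since_update(text, today)
    return months is None or months >= max_months
```

Passing `today` explicitly instead of calling `date.today()` inside the function keeps the check deterministic and easy to test.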
Who should prioritize this
Should publish llms-full.txt:
- Sites with 30+ pages
- Sites with clear editorial / service structure worth mapping
- Sites competing for AI-mediated queries where getting cited accurately matters
Should probably skip:
- Single-page landing sites
- Sites with <15 pages (llms.txt alone is sufficient)
- Sites that are genuinely agent-hostile by design (don't want to be retrieved)
Default: if you're publishing llms.txt, also publish llms-full.txt. The marginal effort is small; the marginal benefit is meaningful.
Related reading
- llms.txt Validator — short-form companion
- llms.txt Quality Scorer — quality rubric
- llms.txt Generator — generate the short-form file
- RAG Readiness Audit — adjacent retrieval-readiness
Fact-check notes and sources
- llms.txt proposal: llmstxt.org — Jeremy Howard + Answer.AI
- llms-full.txt convention: same source, longer-form variant
- Retrieval advantage (observational): community benchmarks and AEO-monitoring studies (2025-2026)
This post is informational, not AEO-consulting advice. Mentions of llmstxt.org, Answer.AI, OpenAI, Anthropic, Google, Perplexity are nominative fair use. No affiliation is implied.