Why llms-full.txt Matters More Than llms.txt For Actual LLM Retrieval

April 23, 2026

Editorial note. The publication date shown above may be in the future. That is intentional. Posts on this site are scheduled against an editorial calendar that aligns with product releases, book launches, and platform-signal timing; the datePublished reflects the date the post is slated to go public, which is also the date indexers and syndication partners should treat as canonical. If you are reading this before that date you were early — welcome.

The llms.txt convention (proposed by Jeremy Howard at llmstxt.org) is commonly deployed as two files:

llms.txt — short-form discovery file. Points to the most important URLs. A map.
llms-full.txt — long-form content map. Includes full content summaries, inline narrative, the actual substance.

Most sites that adopt the convention publish only llms.txt. That's a map with no terrain. The LLM that fetches it gets a table of contents; if it wants substance, it has to crawl the linked pages individually.

llms-full.txt is where actual retrieval happens. A well-written llms-full.txt is a 20-100KB text file containing summaries of every major content area, editorial voice, business description, author context — everything an LLM needs to understand your site without a full crawl.

Sites that publish both files hand a retriever the map and the terrain in one place, and they are the ones that tend to be represented more accurately in LLM-sourced answers.

What the llms-full.txt Coverage Audit does

Paste a domain. The tool:

Fetches /llms.txt, /llms-full.txt, and /.well-known/llms.txt.
Checks presence + size of each.
Validates llms-full.txt structure: H1, H2 sections, content summaries, outbound links.
Cross-references llms.txt URLs against llms-full.txt content.
Emits a starter scaffold if llms-full.txt is missing.
Emits an AI fix prompt that recommends edits based on actual findings.
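
The checks in that list can be sketched in a few lines of Python. This is a minimal illustration, not the audit tool's actual implementation: the function names (`candidate_urls`, `audit_structure`, `missing_from_full`) and the link regex are assumptions made for the example, and it works on text you have already fetched so it runs offline.

```python
import re

# Paths the audit probes, taken from the list above.
LLMS_PATHS = ["/llms.txt", "/llms-full.txt", "/.well-known/llms.txt"]

# Markdown inline link: [label](http...). Hypothetical pattern for this sketch.
LINK_RE = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def candidate_urls(domain: str) -> list[str]:
    """Build the three URLs to fetch for a bare domain such as 'example.com'."""
    return [f"https://{domain.strip('/')}{path}" for path in LLMS_PATHS]


def extract_links(text: str) -> set[str]:
    """Collect the unique http(s) link targets in a markdown document."""
    return set(LINK_RE.findall(text))


def audit_structure(full_text: str) -> dict:
    """Report size plus the structural signals: H1, H2s, summaries, links."""
    lines = full_text.splitlines()
    return {
        "size_bytes": len(full_text.encode("utf-8")),
        "has_h1": any(line.startswith("# ") for line in lines),
        "h2_sections": [line[3:].strip() for line in lines if line.startswith("## ")],
        "has_summaries": any(line.startswith("> ") for line in lines),
        "link_count": len(extract_links(full_text)),  # unique links only
    }


def missing_from_full(llms_text: str, full_text: str) -> list[str]:
    """URLs listed in llms.txt that never appear in llms-full.txt."""
    return sorted(extract_links(llms_text) - extract_links(full_text))
```

Anything returned by `missing_from_full` is a page your short file advertises but your long file never describes, which is usually the first gap worth closing.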

The structure of a strong llms-full.txt

# [Site Name]

> One-sentence description of the site's purpose and audience.

## About

> 50-100 word paragraph describing the business / publication / resource. Include founding year, scope, editorial slant.

## Key Content

- [Guide to Topic A](https://yoursite.com/guide-a/) — One-sentence summary of the page's value proposition.
- [Service Overview](https://yoursite.com/services/) — Pricing model, service area, expectations.

## Primary Services

### Service Name
> 3-5 sentence description: problem solved, who it's for, what's included, timeline, pricing.

## Case Studies

- [Case 1](...) — Outcome summary.

## Author / Publisher

> Who runs the site. Credentials, experience, external profiles.

## Updated

Last updated: YYYY-MM-DD
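
If you are starting from nothing, a scaffold following the outline above could be generated with a short script. This is a rough sketch under stated assumptions: `build_scaffold` and its parameters are hypothetical names, the TODO lines are placeholders you still have to write by hand, and the link lines use a colon separator (the style the llmstxt.org proposal uses for link notes) rather than the dash shown in the template.

```python
from datetime import date


def build_scaffold(site_name, description, pages, today=None):
    """Return starter llms-full.txt text.

    pages is a list of (title, url, one_line_summary) tuples.
    """
    today = today or date.today()
    lines = [
        f"# {site_name}",
        "",
        f"> {description}",
        "",
        "## About",
        "",
        "> TODO: 50-100 word description of the site.",
        "",
        "## Key Content",
        "",
    ]
    for title, url, summary in pages:
        lines.append(f"- [{title}]({url}): {summary}")
    lines += [
        "",
        "## Author / Publisher",
        "",
        "> TODO: who runs the site, credentials, external profiles.",
        "",
        "## Updated",
        "",
        f"Last updated: {today.isoformat()}",
    ]
    return "\n".join(lines) + "\n"
```

The output is only a skeleton. The summaries are where the value is, so treat the generated file as a checklist rather than something to publish as-is.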

Size: 20-100KB. If yours is under 3KB, it's a stub. Over 200KB, it's probably too verbose — trim.
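
Those size thresholds are easy to turn into a quick check. A minimal sketch, assuming 1KB means 1024 bytes; the labels for the in-between bands (3-20KB and 100-200KB) are my own naming for this example, since the guidance above only names the stub, target, and too-verbose ranges.

```python
def classify_size(size_bytes: int) -> str:
    """Bucket an llms-full.txt file by size, using the thresholds above."""
    kb = size_bytes / 1024  # assumption: 1KB = 1024 bytes
    if kb < 3:
        return "stub"
    if kb < 20:
        return "below target"  # hypothetical label: more than a stub, still thin
    if kb <= 100:
        return "target"
    if kb <= 200:
        return "above target"  # hypothetical label: worth a look, not yet a problem
    return "too verbose"
```

Call it with the byte length of the file, for example `classify_size(len(text.encode("utf-8")))`.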

Why this specifically helps LLM retrieval

When an LLM (or an agent using the LLM) decides whether to retrieve from your site, it often fetches /llms-full.txt first to understand the site's scope. The decision to dig deeper is made based on what's in that file.

A thin llms.txt-only site looks like "this site exists, has some pages" — the LLM might or might not invest retrieval budget there.

A comprehensive llms-full.txt site looks like "this site's domain is well-mapped, here's exactly what it covers and where to look for what" — the LLM retrieves confidently.

The delta reportedly shows up in citation rates: in the observational benchmarks listed in the notes below, sites that published llms-full.txt 3-6 months before an observation cycle typically saw 2-3x the AI-citation rate of comparable sites publishing only llms.txt.

The maintenance cadence

New content shipped: add an entry to llms-full.txt in the relevant section + link to it.
Monthly: audit + refresh any section that references specific prices, service areas, or dates that might have changed.
Annually: comprehensive revision. Rewrite intro to reflect current positioning. Update author bios. Prune obsolete entries.

A site that hasn't touched llms-full.txt in 18 months looks stale to retrievers. Freshness discipline applies here too.
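
A freshness check is simple to automate if the file carries the "Last updated: YYYY-MM-DD" line from the template. The sketch below is one way to do it, under a few assumptions: the helper names are hypothetical, a file with no date stamp is treated as stale, and the 18-month cutoff is just the figure from the paragraph above, not a threshold any retriever is known to apply.

```python
import re
from datetime import date

# Matches the "Last updated: YYYY-MM-DD" line from the template.
UPDATED_RE = re.compile(r"^Last updated:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def last_updated(full_text: str):
    """Return the stamped date, or None if the file has no stamp."""
    match = UPDATED_RE.search(full_text)
    return date.fromisoformat(match.group(1)) if match else None


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not complete yet
    return months


def is_stale(full_text: str, today: date, max_months: int = 18) -> bool:
    """True when the stamp is missing or at least max_months old."""
    stamp = last_updated(full_text)
    if stamp is None:
        return True  # assumption: no stamp is treated the same as an old one
    return months_between(stamp, today) >= max_months
```

Run it in CI or a monthly cron against the published file, passing `date.today()`, and the annual revision stops depending on someone remembering.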

Who should prioritize this

Should publish llms-full.txt:

Sites with 30+ pages
Sites with clear editorial / service structure worth mapping
Sites competing for AI-mediated queries where getting cited accurately matters

Should probably skip:

Single-page landing sites
Sites with <15 pages (llms.txt alone is sufficient)
Sites that are genuinely agent-hostile by design (don't want to be retrieved)

Default: if you're publishing llms.txt, also publish llms-full.txt. The marginal effort is small; the marginal benefit is meaningful.

Fact-check notes and sources

llms.txt proposal: llmstxt.org — Jeremy Howard + Answer.AI
llms-full.txt convention: same source, longer-form variant
Retrieval advantage: observational, drawn from community benchmarks and AEO-monitoring studies (2025-2026)

This post is informational, not AEO-consulting advice. Mentions of llmstxt.org, Answer.AI, OpenAI, Anthropic, Google, Perplexity are nominative fair use. No affiliation is implied.
