Why llms-full.txt Matters More Than llms.txt For Actual LLM Retrieval

The llms.txt convention (proposed by Jeremy Howard at llmstxt.org) is usually deployed as two files:

  • llms.txt — short-form discovery file. Points to the most important URLs. A map.
  • llms-full.txt — long-form content map. Includes full content summaries, inline narrative, the actual substance.

Most sites that adopt the convention publish only llms.txt. That's a map with no terrain. The LLM that fetches it gets a table of contents; if it wants substance, it has to crawl the linked pages individually.

llms-full.txt is where actual retrieval happens. A well-written llms-full.txt is a 20-100KB text file containing summaries of every major content area, editorial voice, business description, author context — everything an LLM needs to understand your site without a full crawl.

Sites that publish both files appear to be the ones getting preferred treatment in LLM-sourced answers.

What the llms-full.txt Coverage Audit does

Paste a domain. The tool:

  1. Fetches /llms.txt, /llms-full.txt, and /.well-known/llms.txt.
  2. Checks presence + size of each.
  3. Validates llms-full.txt structure: H1, H2 sections, content summaries, outbound links.
  4. Cross-references llms.txt URLs against llms-full.txt content.
  5. Emits a starter scaffold if llms-full.txt is missing.
  6. Emits an AI fix prompt that recommends edits based on actual findings.
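
Steps 1, 2, and 4 can be sketched roughly as follows. This is a minimal illustration, not the audit tool's actual code: the names `fetch_text`, `audit_presence`, and `missing_from_full` are hypothetical, the fetcher is injected so the logic can be exercised offline, and the cross-reference assumes both files use Markdown-style `[label](url)` links.

```python
import re
import urllib.request

# The three locations named in step 1.
CANDIDATE_PATHS = ["/llms.txt", "/llms-full.txt", "/.well-known/llms.txt"]

# Markdown inline links with an absolute URL: [label](https://example.com/page/)
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def fetch_text(url, timeout=10):
    """Return the response body as text, or None if the fetch fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


def audit_presence(domain, fetch=fetch_text):
    """Steps 1-2: record presence and byte size for each candidate path."""
    report = {}
    for path in CANDIDATE_PATHS:
        body = fetch(f"https://{domain}{path}")
        report[path] = {
            "present": body is not None,
            "bytes": len(body.encode("utf-8")) if body is not None else 0,
        }
    return report


def missing_from_full(llms_txt, llms_full_txt):
    """Step 4: URLs linked from llms.txt that llms-full.txt never links to."""
    covered = set(LINK_RE.findall(llms_full_txt))
    return [url for url in LINK_RE.findall(llms_txt) if url not in covered]
```

Calling `audit_presence("yoursite.com")` performs real HTTP requests; passing a stub as `fetch` (for example a dict's `.get` method) keeps the check testable without network access.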

The structure of a strong llms-full.txt

# [Site Name]

> One-sentence description of the site's purpose and audience.

## About

> 50-100 word paragraph describing the business / publication / resource. Include founding year, scope, editorial slant.

## Key Content

- [Guide to Topic A](https://yoursite.com/guide-a/) — One-sentence summary of the page's value proposition.
- [Service Overview](https://yoursite.com/services/) — Pricing model, service area, expectations.

## Primary Services

### Service Name
> 3-5 sentence description: problem solved, who it's for, what's included, timeline, pricing.

## Case Studies

- [Case 1](...) — Outcome summary.

## Author / Publisher

> Who runs the site. Credentials, experience, external profiles.

## Updated

Last updated: YYYY-MM-DD

Size: 20-100KB. If yours is under 3KB, it's a stub. Over 200KB, it's probably too verbose — trim.
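
A structure and size check against the template above might look like the sketch below. It is illustrative only: `validate_structure` and `classify_size` are hypothetical names, 1KB is assumed to mean 1024 bytes, and the checks mirror the template (one H1, H2 sections, blockquote summaries, links, a dated "Last updated" line) rather than any formal specification.

```python
import re

# Thresholds from the size guidance above; 1KB = 1024 bytes is an assumption.
STUB_BYTES = 3 * 1024
TARGET_MIN_BYTES = 20 * 1024
TARGET_MAX_BYTES = 100 * 1024
VERBOSE_BYTES = 200 * 1024

MD_LINK_RE = re.compile(r"\[[^\]]+\]\([^)]+\)")
UPDATED_RE = re.compile(r"^Last updated: \d{4}-\d{2}-\d{2}$", re.MULTILINE)


def classify_size(n_bytes):
    """Map a file size in bytes onto the bands described in the post."""
    if n_bytes < STUB_BYTES:
        return "stub"
    if n_bytes < TARGET_MIN_BYTES:
        return "below target"
    if n_bytes <= TARGET_MAX_BYTES:
        return "target"
    if n_bytes <= VERBOSE_BYTES:
        return "above target"
    return "too verbose"


def validate_structure(text):
    """Report which template elements are present in an llms-full.txt body."""
    lines = text.splitlines()
    h1 = [ln for ln in lines if ln.startswith("# ")]
    h2 = [ln[3:].strip() for ln in lines if ln.startswith("## ")]
    return {
        "has_single_h1": len(h1) == 1,
        "h2_sections": h2,
        "has_summary_blockquote": any(ln.startswith("> ") for ln in lines),
        "link_count": len(MD_LINK_RE.findall(text)),
        "has_updated_date": UPDATED_RE.search(text) is not None,
    }
```

A file that still contains the literal placeholder `YYYY-MM-DD` fails the date check, which is a cheap way to catch an unedited scaffold.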

Why this specifically helps LLM retrieval

When an LLM (or an agent using the LLM) decides whether to retrieve from your site, it may fetch /llms-full.txt first to understand the site's scope. When it does, the decision to dig deeper is based on what's in that file.

A thin llms.txt-only site looks like "this site exists, has some pages" — the LLM might or might not invest retrieval budget there.

A comprehensive llms-full.txt site looks like "this site's domain is well-mapped, here's exactly what it covers and where to look for what" — the LLM retrieves confidently.

The delta shows up in citation rates. According to the observational benchmarks listed in the notes at the end of this post, sites that published llms-full.txt 3-6 months before an observation cycle typically see 2-3x the AI-citation rate of comparable sites publishing only llms.txt.

The maintenance cadence

  • New content shipped: add an entry to llms-full.txt in the relevant section + link to it.
  • Monthly: audit + refresh any section that references specific prices, service areas, or dates that might have changed.
  • Annually: comprehensive revision. Rewrite intro to reflect current positioning. Update author bios. Prune obsolete entries.

A site that hasn't touched llms-full.txt in 18 months looks stale to retrievers. Freshness discipline applies here too.
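
The 18-month staleness rule is easy to automate. The sketch below assumes the file carries a `Last updated: YYYY-MM-DD` line as in the template; the function names are hypothetical, and treating a missing date as stale is an assumption rather than anything retrievers are known to do.

```python
import re
from datetime import date

STALE_AFTER_MONTHS = 18  # threshold taken from the paragraph above

UPDATED_RE = re.compile(
    r"^Last updated:\s*(\d{4})-(\d{2})-(\d{2})\s*$", re.MULTILINE
)


def last_updated(text):
    """Parse the 'Last updated' line, or return None if there is none."""
    match = UPDATED_RE.search(text)
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    return date(year, month, day)


def months_between(earlier, later):
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not complete yet
    return months


def is_stale(text, today=None):
    """True if the file is undated or was last touched 18+ months ago."""
    today = today or date.today()
    updated = last_updated(text)
    if updated is None:
        return True  # assumption: an undated file is treated as stale
    return months_between(updated, today) >= STALE_AFTER_MONTHS
```

Running `is_stale` in a monthly CI job is one way to turn the cadence above into a reminder instead of a habit someone has to remember.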

Who should prioritize this

Should publish llms-full.txt:

  • Sites with 30+ pages
  • Sites with clear editorial / service structure worth mapping
  • Sites competing for AI-mediated queries where getting cited accurately matters

Should probably skip:

  • Single-page landing sites
  • Sites with <15 pages (llms.txt alone is sufficient)
  • Sites that are genuinely agent-hostile by design (don't want to be retrieved)

Default for everyone else: if you're publishing llms.txt, also publish llms-full.txt. The marginal effort is small; the marginal benefit is meaningful.
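
For anyone scripting this across many domains, the checklist can be reduced to a small decision function. This is one reading of the lists above, not a rule from the llms.txt proposal: `should_publish_llms_full` is a hypothetical name, only the page-count and agent-hostile criteria are modeled, and sites with 15-29 pages fall through to the default.

```python
def should_publish_llms_full(page_count, publishes_llms_txt=True,
                             agent_hostile=False):
    """Return True when the checklist points toward publishing llms-full.txt."""
    if agent_hostile:
        return False  # the site does not want to be retrieved
    if page_count < 15:
        return False  # covers single-page sites; llms.txt alone is sufficient
    if page_count >= 30:
        return True
    # 15-29 pages: the lists leave this open, so apply the default rule.
    return publishes_llms_txt
```

The qualitative criteria (a structure worth mapping, competition for AI-mediated queries) still need a human judgment call and are deliberately left out.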

Fact-check notes and sources

  • llms.txt proposal: llmstxt.org — Jeremy Howard + Answer.AI
  • llms-full.txt convention: same source, longer-form variant
  • Retrieval advantage (observational): community benchmarks and AEO-monitoring studies (2025-2026)

This post is informational, not AEO-consulting advice. Mentions of llmstxt.org, Answer.AI, OpenAI, Anthropic, Google, Perplexity are nominative fair use. No affiliation is implied.

Last updated: April 2026