TL;DR. llms-ctx.txt and llms-ctx-full.txt are sibling files to llms.txt. Where llms.txt is a reading list of links, llms-ctx.txt is the same list with each linked markdown page already fetched and wrapped in <doc>…</doc> XML tags, ready to paste directly into an LLM prompt. The -full variant is the same thing without the lite-summary compression. If you run a docs site, an API reference, or a knowledge base, both are high-leverage. If you run a marketing site or a blog, you can ignore them.
Background: what llms.txt is and what's missing
The llms.txt proposal (Jeremy Howard, llmstxt.org) addresses a real problem. LLMs need a curated reading list to make sense of a site, and the navigation chrome of a typical HTML page (header, sidebar, footer, popups) eats space that would be better spent on the actual content. The llms.txt file is a markdown index at the site root listing the canonical pages an LLM should read, ideally pointing at clean .md versions of each.
That's useful but partial. The agent still has to fetch each listed page itself. For a workflow where someone wants to dump everything relevant about a project into a single prompt (a "prime the context" pattern that's common with Claude, ChatGPT, and Gemini), the back-and-forth fetching is friction. That's where the -ctx extensions come in.
What llms-ctx.txt actually contains
llms-ctx.txt takes the link list from llms.txt and inlines the content of each linked page, wrapped in <doc> tags with metadata. A trimmed example:
<doc title="Quick Start" source="https://example.com/docs/quickstart">
# Quick Start
...page body in markdown...
</doc>
<doc title="API Reference" source="https://example.com/docs/api">
# API Reference
...page body in markdown...
</doc>
Drop the whole thing into a prompt and the LLM has the curated corpus in one shot. No fetching, no scraping, no chunking gymnastics. The XML tags give the model unambiguous boundaries between source documents so it can attribute claims back when it summarizes or answers questions.
llms-ctx-full.txt is the same idea without the size-trimming. llms-ctx.txt may exclude optional sections of the source pages (often what llms.txt marks as "Optional"). The -full variant includes everything. Pick whichever one fits the context budget of the model you're targeting.
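For the consuming side, here is a minimal sketch of the "prime the context" pattern: fetch a published llms-ctx.txt and prepend it to a question before sending it to whatever model you use. The URL and the question are placeholders, not part of any spec.

```python
import urllib.request

CTX_URL = "https://example.com/llms-ctx.txt"  # placeholder docs site

# Pull down the pre-bundled corpus in one request.
with urllib.request.urlopen(CTX_URL) as resp:
    ctx = resp.read().decode("utf-8")

# The <doc> boundaries survive intact, so the model can attribute
# an answer back to the source document it came from.
prompt = (
    "Use only the documents below to answer.\n\n"
    f"{ctx}\n\n"
    "Question: How do I authenticate against the API described above?"
)
# `prompt` is now ready to paste into (or send to) whichever LLM you target.
```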
Who should publish these
The honest answer: a small subset of sites.
Strong fit:
- Documentation sites. The whole point is to let a user prime an LLM with the docs.
- Knowledge bases, especially internal ones republished externally.
- API references where developers paste the spec into an agent.
- Tutorial corpora where each lesson stands on its own as a <doc>.
Marginal fit:
- Blog corpora. Useful if the blog is a single coherent body of work; less so if it's a grab bag.
- Open-source project sites that wrap docs around code.
Skip:
- Marketing sites and brochure sites. There's no docs body to bundle, and the prompt-injection use case isn't real for visitors.
- E-commerce sites. Product catalogs aren't the right shape for <doc> bundling.
- News sites. Continuously-changing content makes the file stale before it's published.
If you're already publishing a clean llms.txt, generating the -ctx variants is mostly a build-time concatenation step. If you're not yet publishing llms.txt, start there.
How to generate them
The FastHTML reference implementation does it as a build step. The pseudo-code:
- Parse llms.txt, extracting the list of linked URLs.
- For each URL, fetch the markdown version of the page (the same .md you'd serve under content negotiation, or the original markdown source if you publish from a static site generator).
- Wrap each page in a <doc title="..." source="...">…</doc> block.
- Concatenate the blocks into llms-ctx.txt, applying the lite-compression rule (drop the "Optional" sections from your llms.txt index).
- Concatenate them into llms-ctx-full.txt without the lite-compression.
- Serve both at /llms-ctx.txt and /llms-ctx-full.txt.
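A minimal Python sketch of that pipeline follows. It assumes your llms.txt uses standard markdown links, that each link resolves to a markdown body, and that "Optional" links live under an `## Optional` heading; the site URL is a placeholder.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urljoin

SITE = "https://example.com"  # placeholder: your site root

def fetch(url: str) -> str:
    """Fetch a URL and return its body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

def parse_llms_txt(text: str) -> list[tuple[str, str, bool]]:
    """Return (title, url, is_optional) for each markdown link in llms.txt,
    marking links that sit under an '## Optional' heading."""
    links, in_optional = [], False
    for line in text.splitlines():
        if line.startswith("## "):
            in_optional = line.strip().lower() == "## optional"
        m = re.search(r"\[([^\]]+)\]\((\S+?)\)", line)
        if m:
            links.append((m.group(1), urljoin(SITE + "/", m.group(2)), in_optional))
    return links

def build(out_path: str, include_optional: bool) -> None:
    docs = []
    for title, url, is_optional in parse_llms_txt(fetch(f"{SITE}/llms.txt")):
        if is_optional and not include_optional:
            continue  # the lite file drops the "Optional" sections
        body = fetch(url)  # assumes the link already points at a markdown version
        docs.append(f'<doc title="{title}" source="{url}">\n{body}\n</doc>')
    Path(out_path).write_text("\n".join(docs) + "\n", encoding="utf-8")

build("llms-ctx.txt", include_optional=False)      # lite: drop Optional sections
build("llms-ctx-full.txt", include_optional=True)  # full: include everything
```

In a real static-site build you'd read the markdown sources from disk instead of re-fetching your own pages, but the shape of the step is the same: parse the index, gather the bodies, wrap, concatenate.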
The Eleventy collection pattern works well for this. So does a build-step script for Astro, Gatsby, Hugo, or any other static generator. Hosting platforms like Netlify, Vercel, and Cloudflare Pages can publish the resulting files without any special configuration.
For sites that already publish llms-full.txt (the alternate, non-FastHTML variant of "all my content concatenated"), llms-ctx-full.txt is mostly a rename plus XML wrappers. If your audience reaches for one or the other, supporting both is cheap.
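A rough sketch of that conversion, assuming the pages inside llms-full.txt are separated by level-1 headings (the source attribute is omitted here because a concatenated file usually doesn't carry per-page URLs):

```python
import re
from pathlib import Path

# Assumption: llms-full.txt separates pages with level-1 headings.
# If yours uses a different separator, adjust the split pattern.
full = Path("llms-full.txt").read_text(encoding="utf-8")
sections = re.split(r"(?m)^(?=# )", full)

docs = []
for section in filter(str.strip, sections):
    title = section.splitlines()[0].lstrip("# ").strip()
    docs.append(f'<doc title="{title}">\n{section.strip()}\n</doc>')

Path("llms-ctx-full.txt").write_text("\n".join(docs) + "\n", encoding="utf-8")
```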
Size discipline
A common failure mode is shipping a 4 MB llms-ctx.txt that nobody can paste into their model's context window. Two disciplines help.
First, the lite version (no -full suffix) should respect a target token budget. 50,000 tokens is a reasonable ceiling because it fits Claude Opus, GPT-5, and Gemini Pro with room for the user's actual prompt. The "Optional" markers in llms.txt exist exactly to support this trimming.
Second, the -full version can run larger, but keep the <doc> boundaries intact and aim for 200,000 tokens or less so it fits the longest-context models without forcing the reader to chunk it themselves. If your corpus is bigger than that, your llms-ctx.txt is probably the wrong shape and you should be publishing topic-scoped variants instead.
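A budget check at build time catches the 4 MB failure mode before it ships. A sketch using the rough heuristic of about four characters per token for English markdown (swap in a real tokenizer if you need precision; the filenames and budgets mirror the numbers above):

```python
from pathlib import Path

# Rough heuristic: ~4 characters per token for English markdown.
BUDGETS = {"llms-ctx.txt": 50_000, "llms-ctx-full.txt": 200_000}

for name, budget in BUDGETS.items():
    approx_tokens = len(Path(name).read_text(encoding="utf-8")) // 4
    status = "OK" if approx_tokens <= budget else "OVER BUDGET"
    print(f"{name}: ~{approx_tokens:,} tokens (budget {budget:,}) {status}")
```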
How to audit your own files
The Mega SEO Analyzer checks for both /llms-ctx.txt and /llms-ctx-full.txt as part of its agent-readiness aux-file scan. Findings only fire when the site looks like it could plausibly benefit: the analyzer already sees llms-full.txt, schema indicating a content site, or article-shaped content on the page.
For the file content itself, the llms.txt Quality Scorer covers the format checks (well-formed XML tags, no unclosed blocks, document-title presence). The llms.txt Validator covers the base llms.txt spec compliance.
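If you want a pre-publish sanity check without waiting on an external tool, the core format rules are easy to script. A sketch that counts <doc> open/close pairs and flags missing title attributes (the filename is whichever variant you're checking):

```python
import re
from pathlib import Path

text = Path("llms-ctx.txt").read_text(encoding="utf-8")

opens = re.findall(r"<doc\b[^>]*>", text)
closes = re.findall(r"</doc>", text)
untitled = [tag for tag in opens if 'title="' not in tag]

print(f"{len(opens)} <doc> blocks, {len(closes)} closing tags")
if len(opens) != len(closes):
    print("WARNING: unclosed <doc> block(s)")
if untitled:
    print(f"WARNING: {len(untitled)} <doc> tag(s) missing a title attribute")
```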
Related reading
- AGENTS.md: The Root-Level README That AI Coding Agents Actually Read
- The Cloudflare Agent Readiness Score and What It Actually Checks
- Markdown for Agents: Serving Your Pages Twice
- The Open Agent Protocol Stack
- llms-full.txt Coverage Audit: When the All-Pages Bundle Goes Wrong
If you're shipping a docs-first product on a sub-$100 stack and want the rest of the agent-readable surface (llms.txt, well-known files, schema), the same playbook fits inside The $97 Launch without any vendor lock-in.
Fact-check notes and sources
- Jeremy Howard / Answer.AI: llmstxt.org spec
- FastHTML: llms.txt extensions reference implementation
- Anthropic: Claude prompt-engineering guide on XML tags
- OpenAI: GPT-5 context-window documentation
This post is informational. The FastHTML extension is a community proposal layered on top of Jeremy Howard's llms.txt spec, not a Google or OpenAI standard. Adoption is uneven; check the spec page for current status before building infrastructure around it.