TL;DR. llms-ctx.txt and llms-ctx-full.txt are sibling files to llms.txt. Where llms.txt is a reading list of links, llms-ctx.txt is the same list with each linked markdown page already fetched and wrapped in <doc>…</doc> XML tags, ready to paste directly into an LLM prompt. The -full variant is the same thing without the lite-summary compression. If you run a docs site, an API reference, or a knowledge base, both are high-leverage. If you run a marketing site or a blog, you can ignore them.
Background: what llms.txt is and what's missing
The llms.txt proposal (Jeremy Howard, llmstxt.org) addresses a real problem. LLMs need a curated reading list to make sense of a site, and the navigation chrome of a typical HTML page (header, sidebar, footer, popups) eats space that would be better spent on the actual content. The llms.txt file is a markdown index at the site root listing the canonical pages an LLM should read, ideally pointing at clean .md versions of each.
That's useful but partial. The agent still has to fetch each listed page itself. For a workflow where someone wants to dump everything relevant about a project into a single prompt (a "prime the context" pattern that's common with Claude, ChatGPT, and Gemini), the back-and-forth fetching is friction. That's where the -ctx extensions come in.
What llms-ctx.txt actually contains
llms-ctx.txt takes the link list from llms.txt and inlines the content of each linked page, wrapped in <doc> tags with metadata. A trimmed example:
<doc title="Quick Start" source="https://example.com/docs/quickstart">
# Quick Start
...page body in markdown...
</doc>
<doc title="API Reference" source="https://example.com/docs/api">
# API Reference
...page body in markdown...
</doc>
Drop the whole thing into a prompt and the LLM has the curated corpus in one shot. No fetching, no scraping, no chunking gymnastics. The XML tags give the model unambiguous boundaries between source documents so it can attribute claims back when it summarizes or answers questions.
llms-ctx-full.txt is the same idea without the size-trimming. llms-ctx.txt may exclude optional sections of the source pages (often what llms.txt marks as "Optional"). The -full variant includes everything. Pick whichever one fits the context budget of the model you're targeting.
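For the consuming side, here is a minimal sketch of the "prime the context" pattern: fetch a published llms-ctx.txt and prepend it to a question before sending it to whatever model you use. The URL and the question are placeholders, not part of any spec.

```python
import urllib.request

CTX_URL = "https://example.com/llms-ctx.txt"  # placeholder docs site

# Pull down the pre-bundled corpus in one request.
with urllib.request.urlopen(CTX_URL) as resp:
    ctx = resp.read().decode("utf-8")

# The <doc> boundaries survive intact, so the model can attribute
# an answer back to the source document it came from.
prompt = (
    "Use only the documents below to answer.\n\n"
    f"{ctx}\n\n"
    "Question: How do I authenticate against the API described above?"
)
# `prompt` is now ready to paste into (or send to) whichever LLM you target.
```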
Who should publish these
The honest answer: a small subset of sites.
Strong fit:
- Documentation sites. The whole point is to let a user prime an LLM with the docs.
- Knowledge bases, especially internal ones republished externally.
- API references where developers paste the spec into an agent.
- Tutorial corpora where each lesson stands on its own as a <doc>.
Marginal fit:
- Blog corpora. Useful if the blog is a single coherent body of work; less so if it's a grab bag.
- Open-source project sites that wrap docs around code.
Skip:
- Marketing sites and brochure sites. There's no docs body to bundle, and the prompt-injection use case isn't real for visitors.
- E-commerce sites. Product catalogs aren't the right shape for <doc> bundling.
- News sites. Continuously-changing content makes the file stale before it's published.
If you're already publishing a clean llms.txt, generating the -ctx variants is mostly a build-time concatenation step. If you're not yet publishing llms.txt, start there.
How to generate them
The FastHTML reference implementation does it as a build step. The pseudo-code:
- Parse llms.txt, extracting the list of linked URLs.
- For each URL, fetch the markdown version of the page (the same .md you'd serve under content negotiation, or the original markdown source if you publish from a static site generator).
- Wrap each page in a <doc title="..." source="...">…</doc> block.
- Concatenate the blocks into llms-ctx.txt, applying the lite-compression rule (drop the "Optional" sections from your llms.txt index).
- Concatenate them into llms-ctx-full.txt without the lite-compression.
- Serve both at /llms-ctx.txt and /llms-ctx-full.txt.
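A minimal Python sketch of that pipeline follows. It assumes your llms.txt uses standard markdown links, that each link resolves to a markdown body, and that "Optional" links live under an `## Optional` heading; the site URL is a placeholder.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urljoin

SITE = "https://example.com"  # placeholder: your site root

def fetch(url: str) -> str:
    """Fetch a URL and return its body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

def parse_llms_txt(text: str) -> list[tuple[str, str, bool]]:
    """Return (title, url, is_optional) for each markdown link in llms.txt,
    marking links that sit under an '## Optional' heading."""
    links, in_optional = [], False
    for line in text.splitlines():
        if line.startswith("## "):
            in_optional = line.strip().lower() == "## optional"
        m = re.search(r"\[([^\]]+)\]\((\S+?)\)", line)
        if m:
            links.append((m.group(1), urljoin(SITE + "/", m.group(2)), in_optional))
    return links

def build(out_path: str, include_optional: bool) -> None:
    docs = []
    for title, url, is_optional in parse_llms_txt(fetch(f"{SITE}/llms.txt")):
        if is_optional and not include_optional:
            continue  # the lite file drops the "Optional" sections
        body = fetch(url)  # assumes the link already points at a markdown version
        docs.append(f'<doc title="{title}" source="{url}">\n{body}\n</doc>')
    Path(out_path).write_text("\n".join(docs) + "\n", encoding="utf-8")

build("llms-ctx.txt", include_optional=False)      # lite: drop Optional sections
build("llms-ctx-full.txt", include_optional=True)  # full: include everything
```

In a real static-site build you'd read the markdown sources from disk instead of re-fetching your own pages, but the shape of the step is the same: parse the index, gather the bodies, wrap, concatenate.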
The Eleventy collection pattern works well for this. So does a build-step script for Astro, Gatsby, Hugo, or any other static generator. Hosting platforms like Netlify, Vercel, and Cloudflare Pages can publish the resulting files without any special configuration.
For sites that already publish llms-full.txt (the alternate, non-FastHTML variant of "all my content concatenated"), llms-ctx-full.txt is mostly a rename plus XML wrappers. If your audience reaches for one or the other, supporting both is cheap.
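A rough sketch of that conversion, assuming the pages inside llms-full.txt are separated by level-1 headings (the source attribute is omitted here because a concatenated file usually doesn't carry per-page URLs):

```python
import re
from pathlib import Path

# Assumption: llms-full.txt separates pages with level-1 headings.
# If yours uses a different separator, adjust the split pattern.
full = Path("llms-full.txt").read_text(encoding="utf-8")
sections = re.split(r"(?m)^(?=# )", full)

docs = []
for section in filter(str.strip, sections):
    title = section.splitlines()[0].lstrip("# ").strip()
    docs.append(f'<doc title="{title}">\n{section.strip()}\n</doc>')

Path("llms-ctx-full.txt").write_text("\n".join(docs) + "\n", encoding="utf-8")
```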
Size discipline
A common failure mode is shipping a 4 MB llms-ctx.txt that nobody can paste into their model's context window. Two disciplines help.
First, the lite version (no -full suffix) should respect a target token budget. 50,000 tokens is a reasonable ceiling because it fits Claude Opus, GPT-5, and Gemini Pro with room for the user's actual prompt. The "Optional" markers in llms.txt exist exactly to support this trimming.
Second, the -full version can run larger, but keep the <doc> boundaries intact and aim for 200,000 tokens or less so it fits the longest-context models without forcing the reader to chunk it themselves. If your corpus is bigger than that, your llms-ctx.txt is probably the wrong shape and you should be publishing topic-scoped variants instead.
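A budget check at build time catches the 4 MB failure mode before it ships. A sketch using the rough heuristic of about four characters per token for English markdown (swap in a real tokenizer if you need precision; the filenames and budgets mirror the numbers above):

```python
from pathlib import Path

# Rough heuristic: ~4 characters per token for English markdown.
BUDGETS = {"llms-ctx.txt": 50_000, "llms-ctx-full.txt": 200_000}

for name, budget in BUDGETS.items():
    approx_tokens = len(Path(name).read_text(encoding="utf-8")) // 4
    status = "OK" if approx_tokens <= budget else "OVER BUDGET"
    print(f"{name}: ~{approx_tokens:,} tokens (budget {budget:,}) {status}")
```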
How to audit your own files
The Mega SEO Analyzer checks for both /llms-ctx.txt and /llms-ctx-full.txt as part of its agent-readiness aux-file scan. Findings only fire when the site looks like it could plausibly benefit: the analyzer already sees llms-full.txt, schema indicating a content site, or article-shaped content on the page.
For the file content itself, the llms.txt Quality Scorer covers the format checks (well-formed XML tags, no unclosed blocks, document-title presence). The llms.txt Validator covers the base llms.txt spec compliance.
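If you want a pre-publish sanity check without waiting on an external tool, the core format rules are easy to script. A sketch that counts <doc> open/close pairs and flags missing title attributes (the filename is whichever variant you're checking):

```python
import re
from pathlib import Path

text = Path("llms-ctx.txt").read_text(encoding="utf-8")

opens = re.findall(r"<doc\b[^>]*>", text)
closes = re.findall(r"</doc>", text)
untitled = [tag for tag in opens if 'title="' not in tag]

print(f"{len(opens)} <doc> blocks, {len(closes)} closing tags")
if len(opens) != len(closes):
    print("WARNING: unclosed <doc> block(s)")
if untitled:
    print(f"WARNING: {len(untitled)} <doc> tag(s) missing a title attribute")
```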
Related reading
- AGENTS.md: The Root-Level README That AI Coding Agents Actually Read
- The Cloudflare Agent Readiness Score and What It Actually Checks
- Markdown for Agents: Serving Your Pages Twice
- The Open Agent Protocol Stack
- llms-full.txt Coverage Audit: When the All-Pages Bundle Goes Wrong
If you're shipping a docs-first product on a sub-$100 stack and want the rest of the agent-readable surface (llms.txt, well-known files, schema), the same playbook fits inside The $97 Launch without any vendor lock-in.
Fact-check notes and sources
- Jeremy Howard / Answer.AI: llmstxt.org spec
- FastHTML: llms.txt extensions reference implementation
- Anthropic: Claude prompt-engineering guide on XML tags
- OpenAI: GPT-5 context-window documentation
This post is informational. The FastHTML extension is a community proposal layered on top of Jeremy Howard's llms.txt spec, not a Google or OpenAI standard. Adoption is uneven; check the spec page for current status before building infrastructure around it.