# llms-ctx.txt and llms-ctx-full.txt: The FastHTML Extensions to llms.txt

A pair of files that takes llms.txt one step further by pre-wrapping linked pages in XML tags, ready for one-shot LLM prompt injection. Worth it for docs and KB sites; skippable for the rest.

Author: J.A. Watte
Published: May 10, 2026
Source: https://jwatte.com/blog/blog-llms-ctx-fasthtml-extension/

---

**TL;DR.** `llms-ctx.txt` and `llms-ctx-full.txt` are sibling files to `llms.txt`. Where `llms.txt` is a reading list of links, `llms-ctx.txt` is the same list with each linked markdown page already fetched and wrapped in `<doc>…</doc>` XML tags, ready to paste directly into an LLM prompt. The `-full` variant is the same thing without the lite-summary compression. If you run a docs site, an API reference, or a knowledge base, both are high-leverage. If you run a marketing site or a blog, you can ignore them.

## Background: what llms.txt is and what's missing

The `llms.txt` proposal (Jeremy Howard, [llmstxt.org](https://llmstxt.org/)) addresses a real problem. LLMs need a curated reading list to make sense of a site, and the navigation chrome of a typical HTML page (header, sidebar, footer, popups) eats space that would be better spent on the actual content. The `llms.txt` file is a markdown index at the site root listing the canonical pages an LLM should read, ideally pointing at clean `.md` versions of each.

That's useful but partial. The agent still has to fetch each listed page itself. For a workflow where someone wants to dump *everything* relevant about a project into a single prompt (a "prime the context" pattern that's common with Claude, ChatGPT, and Gemini), the back-and-forth fetching is friction. That's where the `-ctx` extensions come in.

## What llms-ctx.txt actually contains

`llms-ctx.txt` takes the link list from `llms.txt` and inlines the content of each linked page, wrapped in `<doc>` tags with metadata. A trimmed example:

```xml
<doc title="Quick Start" source="https://example.com/docs/quickstart">
# Quick Start
...page body in markdown...
</doc>

<doc title="API Reference" source="https://example.com/docs/api">
# API Reference
...page body in markdown...
</doc>
```

Drop the whole thing into a prompt and the LLM has the curated corpus in one shot. No fetching, no scraping, no chunking gymnastics. The XML tags give the model unambiguous boundaries between source documents so it can attribute claims back when it summarizes or answers questions.
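On the consuming side, those boundaries are easy to recover programmatically. A minimal Python sketch of splitting a bundle back into its component documents; the regex assumes the exact tag shape and attribute order shown above, which the proposal may not guarantee:

```python
import re

# Split an llms-ctx.txt bundle into its component <doc> blocks.
# Assumes title comes before source, as in the example above.
DOC_RE = re.compile(
    r'<doc title="(?P<title>[^"]*)" source="(?P<source>[^"]*)">\n'
    r'(?P<body>.*?)\n</doc>',
    re.DOTALL,
)

def split_ctx_bundle(text: str) -> list[dict]:
    """Return one dict per <doc> block: title, source URL, markdown body."""
    return [m.groupdict() for m in DOC_RE.finditer(text)]
```

This is what makes the attribution story work: a consumer that keeps the `source` attribute alongside each body can cite the original URL when it answers from the bundle.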

`llms-ctx-full.txt` is the same idea without the size-trimming. `llms-ctx.txt` may exclude optional sections of the source pages (often what `llms.txt` marks as "Optional"). The `-full` variant includes everything. Pick whichever one fits the context budget of the model you're targeting.

## Who should publish these

The honest answer: a small subset of sites.

**Strong fit:**
- Documentation sites. The whole point is to let a user prime an LLM with the docs.
- Knowledge bases, especially internal ones republished externally.
- API references where developers paste the spec into an agent.
- Tutorial corpora where each lesson stands on its own as a `<doc>`.

**Marginal fit:**
- Blog corpora. Useful if the blog is a single coherent body of work; less so if it's a grab bag.
- Open-source project sites that wrap docs around code.

**Skip:**
- Marketing sites and brochure sites. There's no docs body to bundle, and the prompt-injection use case isn't real for visitors.
- E-commerce sites. Product catalogs aren't the right shape for `<doc>` bundling.
- News sites. Continuously changing content makes the file stale before it's published.

If you're already publishing a clean `llms.txt`, generating the `-ctx` variants is mostly a build-time concatenation step. If you're not yet publishing `llms.txt`, start there.

## How to generate them

The FastHTML reference implementation does it as a build step. The pseudo-code:

1. Parse `llms.txt`, extracting the list of linked URLs.
2. For each URL, fetch the markdown version of the page (the same `.md` you'd serve under content negotiation, or the original markdown source if you publish from a static site generator).
3. Wrap each in a `<doc title="..." source="...">…</doc>` block.
4. Concatenate them into `llms-ctx.txt`. Apply the lite-compression rule (drop "Optional" sections from your llms.txt index).
5. Concatenate them into `llms-ctx-full.txt` without the lite-compression.
6. Serve at `/llms-ctx.txt` and `/llms-ctx-full.txt`.
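The six steps above can be sketched in a few dozen lines of Python. This is not the FastHTML implementation: the "Optional" H2 heading convention follows the llms.txt spec, and the `fetch` callable is a stand-in for however your build pulls the `.md` pages.

```python
import re
import urllib.request

# Markdown link: [title](url)
LINK_RE = re.compile(r'\[([^\]]+)\]\((\S+?)\)')

def parse_llms_txt(text: str):
    """Yield (title, url, is_optional) for each link in llms.txt,
    tracking whether we're inside the 'Optional' H2 section."""
    optional = False
    for line in text.splitlines():
        if line.startswith("## "):
            optional = line[3:].strip().lower() == "optional"
        for title, url in LINK_RE.findall(line):
            yield title, url, optional

def wrap_doc(title: str, url: str, body: str) -> str:
    return f'<doc title="{title}" source="{url}">\n{body.strip()}\n</doc>\n'

def build_ctx(llms_txt: str,
              fetch=lambda u: urllib.request.urlopen(u).read().decode()):
    """Return (lite, full) bundle strings. The lite bundle drops
    documents linked under the 'Optional' heading."""
    lite, full = [], []
    for title, url, optional in parse_llms_txt(llms_txt):
        block = wrap_doc(title, url, fetch(url))
        full.append(block)
        if not optional:
            lite.append(block)
    return "\n".join(lite), "\n".join(full)
```

Write the two return values to `llms-ctx.txt` and `llms-ctx-full.txt` in your build output directory and step 6 is just static file serving.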

The Eleventy collection pattern works well for this. So does a build-step script for Astro, Gatsby, Hugo, or any other static generator. Hosting platforms like Netlify, Vercel, and Cloudflare Pages can publish the resulting files without any special configuration.

For sites that already publish `llms-full.txt` (the alternate, non-FastHTML variant of "all my content concatenated"), `llms-ctx-full.txt` is mostly a rename plus XML wrappers. If your audience reaches for one or the other, supporting both is cheap.

## Size discipline

A common failure mode is shipping a 4 MB `llms-ctx.txt` that nobody can paste into their model's context window. Two disciplines help.

First, the lite version (no `-full` suffix) should respect a target token budget. 50,000 tokens is a reasonable ceiling because it fits Claude Opus, GPT-5, and Gemini Pro with room for the user's actual prompt. The "Optional" markers in `llms.txt` exist exactly to support this trimming.

Second, the `-full` version can run larger but still needs a ceiling. Aim for 200,000 tokens or less so it fits the longest-context models without forcing the user to chunk it. If your corpus is bigger than that, your `llms-ctx.txt` is probably the wrong shape and you should be publishing topic-scoped variants instead.
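Both ceilings can be enforced at build time. A sketch using the common four-characters-per-token heuristic for English text; it is an approximation, not a tokenizer, so swap in a real one (e.g. tiktoken) if you need precision:

```python
# Token-budget check for the two bundle variants.
LITE_BUDGET = 50_000    # tokens, per the lite ceiling above
FULL_BUDGET = 200_000   # tokens, per the -full ceiling above

def estimate_tokens(text: str) -> int:
    """Rough English-text estimate: ~4 characters per token."""
    return len(text) // 4

def check_budgets(lite: str, full: str) -> list[str]:
    """Return a warning per bundle that blows its budget."""
    warnings = []
    if estimate_tokens(lite) > LITE_BUDGET:
        warnings.append(
            f"llms-ctx.txt ~{estimate_tokens(lite):,} tokens > {LITE_BUDGET:,}")
    if estimate_tokens(full) > FULL_BUDGET:
        warnings.append(
            f"llms-ctx-full.txt ~{estimate_tokens(full):,} tokens > {FULL_BUDGET:,}")
    return warnings
```

Failing the build on a non-empty warning list is the cheapest way to notice the 4 MB regression before it ships.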

## How to audit your own files

The [Mega SEO Analyzer](/tools/mega-seo-analyzer/) checks for both `/llms-ctx.txt` and `/llms-ctx-full.txt` as part of its agent-readiness aux-file scan. Findings only fire when the site looks like it could plausibly benefit: the scan sees an existing `llms-full.txt`, schema indicating a content site, or article-shaped content already on the page.

For the file content itself, the [llms.txt Quality Scorer](/tools/llms-txt-quality-scorer/) covers the format checks (well-formed XML tags, no unclosed blocks, document-title presence). The [llms.txt Validator](/tools/llms-txt-validator/) covers the base `llms.txt` spec compliance.
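A quick lint in the spirit of those format checks: every `<doc>` must close, carry a title, and blocks must not nest. This is a hypothetical sketch, not the tools' actual implementation:

```python
import re

# Match opening and closing doc tags: <doc ...> and </doc>
TAG_RE = re.compile(r'</?doc\b[^>]*>')

def lint_ctx(text: str) -> list[str]:
    """Return a list of format problems found in a -ctx bundle."""
    problems, depth = [], 0
    for m in TAG_RE.finditer(text):
        tag = m.group()
        if tag.startswith("</"):
            depth -= 1
            if depth < 0:
                problems.append("</doc> without matching <doc>")
        else:
            depth += 1
            if depth > 1:
                problems.append("nested <doc> blocks")
            if 'title="' not in tag:
                problems.append(f"<doc> missing title attribute: {tag}")
    if depth > 0:
        problems.append("unclosed <doc> block")
    return problems
```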

## Related reading

- [AGENTS.md: The Root-Level README That AI Coding Agents Actually Read](/blog/blog-agents-md-root-spec/)
- [The Cloudflare Agent Readiness Score and What It Actually Checks](/blog/blog-cloudflare-agent-readiness-score/)
- [Markdown for Agents: Serving Your Pages Twice](/blog/blog-fix-markdown-for-agents-warning/)
- [The Open Agent Protocol Stack](/blog/blog-agent-protocol-stack/)
- [llms-full.txt Coverage Audit: When the All-Pages Bundle Goes Wrong](/blog/blog-tool-llms-full-txt-coverage-audit/)

If you're shipping a docs-first product on a sub-$100 stack and want the rest of the agent-readable surface (llms.txt, well-known files, schema), the same playbook fits inside [The $97 Launch](https://www.amazon.com/dp/B0FXJBSGGC) without any vendor lock-in.

## Fact-check notes and sources

- Jeremy Howard / Answer.AI: [llmstxt.org spec](https://llmstxt.org/)
- FastHTML: [llms.txt extensions reference implementation](https://github.com/AnswerDotAI/fasthtml/tree/main/fasthtml/llmsfile)
- Anthropic: [Claude prompt-engineering guide on XML tags](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags)
- OpenAI: [GPT-5 context-window documentation](https://platform.openai.com/docs/models)

*This post is informational. The FastHTML extension is a community proposal layered on top of Jeremy Howard's llms.txt spec, not a Google or OpenAI standard. Adoption is uneven; check the spec page for current status before building infrastructure around it.*


---

Canonical HTML: https://jwatte.com/blog/blog-llms-ctx-fasthtml-extension/
RSS: https://jwatte.com/feed.xml
JSON Feed: https://jwatte.com/feed.json
Hero image: https://jwatte.com/images/blog-llms-ctx-fasthtml-extension.webp
