# RAG Splits Your Page Into Chunks — How To Make Each Chunk Retrievable

A Retrieval-Augmented Generation (RAG) pipeline splits your page into fixed-size token chunks, embeds each one, and retrieves the best-matching chunk when a user asks a question. If your content only makes sense when read in order, retrieval fails. The scorer splits the page at your URL into three chunk sizes and grades each chunk.

Author: J.A. Watte
Published: April 30, 2026
Source: https://jwatte.com/blog/blog-tool-chunk-retrievability/

---

_Part of the [AEO / GEO / AI-search audit tool stack](/blog/blog-new-aeo-audit-tools-2026/). See the pillar post for the full catalog of sibling audits and where this one fits in the lineup._

A RAG retriever doesn't read your page. It reads one ~500-token chunk of your page: the chunk whose embedding most closely matches the user's question. If that chunk starts with "This is because…" without naming what "this" refers to, the LLM has no context, and the generated answer comes out incoherent or fabricated.
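
To make the single-chunk retrieval step concrete, here is a minimal sketch. It assumes a toy bag-of-words vector in place of a real embedding model, and the names `embed`, `cosine`, and `retrieve` are illustrative, not part of any RAG library.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str]) -> str:
    """Return the single chunk whose vector best matches the question."""
    q = embed(question)
    return max(chunks, key=lambda c: cosine(q, embed(c)))


chunks = [
    "This is because it reads only one piece at a time.",
    "A RAG retriever reads one chunk of a page, not the whole page.",
]
best = retrieve("What does a RAG retriever read?", chunks)
print(best)
```

The second chunk wins because it names its subject in words the question also uses. The first chunk, which opens with an unresolved "This", shares almost nothing with the question.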

The [Chunk Retrievability Scorer](/tools/chunk-retrievability/) fetches a URL, splits content into 256 / 512 / 1024-token chunks (the three sizes most RAG pipelines use), and scores each chunk on five signals.
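
The three-way split can be approximated as below. This is a sketch, not the scorer's actual code: it assumes one whitespace-separated word counts as one token (a real pipeline would use the embedding model's tokenizer), and `split_into_chunks` is a hypothetical helper name.

```python
def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size tokens.

    Assumption: one whitespace-separated word counts as one token.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


CHUNK_SIZES = (256, 512, 1024)  # the three sizes named above

page_text = "word " * 1300  # placeholder for the fetched page text
chunks_by_size = {size: split_into_chunks(page_text, size) for size in CHUNK_SIZES}
chunk_counts = {size: len(chunks) for size, chunks in chunks_by_size.items()}
print(chunk_counts)
```

Note that the last chunk at each size is usually shorter than the target, which is one reason a size signal is worth scoring.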

## The five chunk signals

1. **Direct-answer opener.** First sentence starts with a concrete noun phrase, not a pronoun or transition word.
2. **Low pronoun density.** < 6% of tokens are pronouns. A higher share leaves unresolved antecedents once the chunk is extracted.
3. **Fact-dense.** 3+ combined numeric tokens or proper nouns. Facts are what LLMs prefer to quote.
4. **Self-contained.** Has a named subject; isn't purely "As we discussed…" scaffolding.
5. **Target size.** Between 60% and 120% of the target chunk size (i.e., roughly 307-614 tokens for a 512-token target).
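
The five signals above can be sketched as simple heuristics. Everything beyond the thresholds in the list is an assumption: the pronoun and transition word lists, the capitalization test used as a proper-noun proxy, the scaffolding phrases, and whitespace tokens standing in for model tokens. The scorer's real rules may differ.

```python
# Assumed word lists; the real scorer's lists are not published in this post.
PRONOUNS = {"it", "its", "this", "that", "these", "those",
            "they", "them", "their", "he", "she", "we"}
TRANSITIONS = {"however", "because", "so", "but", "and", "also", "thus", "as"}
SCAFFOLDING = ("as we discussed", "as mentioned", "as noted")
SIGNALS = ("direct_opener", "low_pronouns", "fact_dense",
           "self_contained", "target_size")


def score_chunk(chunk: str, target_tokens: int) -> dict[str, bool]:
    """Return pass/fail for each of the five signals (heuristic sketch)."""
    tokens = chunk.split()
    if not tokens:
        return dict.fromkeys(SIGNALS, False)
    plain = [t.strip(".,;:!?\"'()").lower() for t in tokens]
    # Tokens containing a digit count as numeric facts.
    numeric = sum(any(ch.isdigit() for ch in t) for t in tokens)
    # Capitalized tokens that do not start a sentence: crude proper-noun proxy.
    proper = sum(
        tok[:1].isupper() and not prev.endswith((".", "!", "?"))
        for prev, tok in zip(tokens, tokens[1:])
    )
    pronoun_share = sum(t in PRONOUNS for t in plain) / len(plain)
    return {
        "direct_opener": plain[0] not in (PRONOUNS | TRANSITIONS),
        "low_pronouns": pronoun_share < 0.06,
        "fact_dense": numeric + proper >= 3,
        "self_contained": proper > 0 and not chunk.lower().startswith(SCAFFOLDING),
        "target_size": 0.6 * target_tokens <= len(tokens) <= 1.2 * target_tokens,
    }


good = "The Chunk Retrievability Scorer splits a page into 256, 512, and 1024 token chunks."
bad = "This is because it was too long, as we discussed, and they fixed it."

# A tiny target of 16 tokens is used only so the demo sentences fit the size window.
good_signals = score_chunk(good, target_tokens=16)
bad_signals = score_chunk(bad, target_tokens=16)
good_score = sum(good_signals.values()) / len(SIGNALS)
bad_score = sum(bad_signals.values()) / len(SIGNALS)
print(good_score, bad_score)
```

The second sentence passes only the size check: it opens with a pronoun, over a third of its tokens are pronouns, and it contains no numbers or named entities.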

The first four signals carry over from [Passage Retrievability](/tools/passage-retrievability/). Chunk-level scoring adds the size signal because RAG pipelines pick a chunk size up front, and every chunk needs to fit inside it.

## Why three sizes

Different RAG frameworks use different defaults:

- **256 tokens.** Very fine-grained. Used when the retriever needs to surgically pick one sentence or one fact. Scoring 256-token chunks catches chunks that are full of unresolved pronouns even at small scope.
- **512 tokens.** A common mid-range setting in LangChain and LlamaIndex pipelines (library defaults vary by splitter and version). Most real-world RAG runs at this size.
- **1024 tokens.** Used when context matters more than precision. Chunks here are less likely to fail self-containment.

If your 256-token chunks average 40% but your 1024-token chunks average 70%, the page has strong topic coherence but weak sentence-level self-containment. Rewrite openers; the average will lift across all sizes.
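
A rough way to automate that comparison is sketched below. The 20-point gap threshold and the names `average_by_size` and `diagnose` are assumptions chosen for illustration, not values the scorer documents.

```python
from statistics import mean


def average_by_size(scores_by_size: dict[int, list[float]]) -> dict[int, float]:
    """Average the per-chunk scores (0.0 to 1.0) for each chunk size."""
    return {size: mean(scores) for size, scores in scores_by_size.items() if scores}


def diagnose(averages: dict[int, float], gap: float = 0.2) -> str:
    """Compare the smallest and largest chunk sizes; gap is an assumed threshold."""
    small = averages[min(averages)]
    large = averages[max(averages)]
    if large - small >= gap:
        return "coherent topic, weak sentence-level self-containment: rewrite openers"
    return "no large gap between chunk sizes"


averages = average_by_size({
    256: [0.4, 0.4],
    512: [0.6, 0.5],
    1024: [0.7],
})
verdict = diagnose(averages)
print(verdict)
```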

## The fix workflow

1. Run the scorer. Toggle to the 512-token view (the most common RAG size).
2. Identify weak chunks (< 50%).
3. Copy the AI fix prompt. Paste into Claude.
4. Get rewrites — opener-fixed sentences you can search-and-replace into the page.
5. Rerun the scorer. Average should lift 10-20 percentage points per pass.
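
Steps 2 and 3 of the workflow above can be sketched in a few lines. The prompt wording here is purely illustrative and is not the tool's actual AI fix prompt; `weak_chunks` and `build_fix_prompt` are hypothetical helper names.

```python
WEAK_THRESHOLD = 0.5  # "< 50%" from step 2


def weak_chunks(scored: list[tuple[str, float]],
                threshold: float = WEAK_THRESHOLD) -> list[str]:
    """Keep only the chunks whose score falls below the threshold."""
    return [chunk for chunk, score in scored if score < threshold]


def build_fix_prompt(chunks: list[str]) -> str:
    """Assemble a rewrite request (illustrative wording, not the tool's prompt)."""
    header = (
        "Rewrite the opening sentence of each chunk so it names its subject "
        "and makes sense without the preceding text.\n\n"
    )
    body = "\n\n".join(f"Chunk {i}:\n{c}" for i, c in enumerate(chunks, start=1))
    return header + body


scored = [
    ("This is because of that.", 0.2),
    ("The scorer grades each chunk on five signals.", 0.8),
]
to_fix = weak_chunks(scored)
prompt = build_fix_prompt(to_fix)
print(prompt)
```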

## Related reading

- [Passage Retrievability](/tools/passage-retrievability/) — paragraph-level scoring
- [AI Citation Readiness](/tools/ai-citation-readiness/) — page-level scoring
- [Speakable Generator](/tools/speakable-generator/) — mark the best chunks for voice
- [FAQ Harvester](/tools/faq-harvester/) — question-answer chunks are naturally retrievable

## Fact-check notes and sources

- LangChain chunk-splitting docs: [python.langchain.com](https://python.langchain.com/)
- LlamaIndex node parsers: [docs.llamaindex.ai](https://docs.llamaindex.ai/)
- Embedding models & chunk size research: [arxiv.org/abs/2311.08377](https://arxiv.org/abs/2311.08377)

---

*The $100 Network covers writing content that survives RAG chunking — direct openers, fact-density, no cross-paragraph references. The scorer is the verification pass.*

