Chunk Retrievability Scorer — 256 / 512 / 1024-Token Ch...

Part of the AEO / GEO / AI-search audit tool stack. See the pillar post for the full catalog of sibling audits and where this one fits in the lineup.

A RAG retriever doesn't read your page. It reads one ~500-token chunk of your page: the chunk whose embedding most closely matches the user's question. If that chunk starts with "This is because…" without naming what "this" refers to, the LLM has no context, and the generated answer is likely to be incoherent or fabricated.

The Chunk Retrievability Scorer fetches a URL, splits content into 256 / 512 / 1024-token chunks (the three sizes most RAG pipelines use), and scores each chunk on five signals.
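The splitting step can be sketched in a few lines. This is a minimal illustration, not the scorer's actual implementation: it assumes whitespace-separated words as a stand-in for tokens (a real pipeline would use the embedding model's tokenizer), and the names `split_into_chunks` and `chunk_all_sizes` are hypothetical.

```python
CHUNK_SIZES = (256, 512, 1024)  # the three sizes named in the text


def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size tokens.

    Assumption: whitespace-separated words approximate tokens.
    """
    tokens = text.split()
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]


def chunk_all_sizes(text: str) -> dict[int, list[str]]:
    """Produce one chunk list per target size."""
    return {size: split_into_chunks(text, size) for size in CHUNK_SIZES}
```

The last chunk at each size is usually shorter than the target, which is one reason a size signal is worth scoring separately.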

The five chunk signals

Direct-answer opener. First sentence starts with a concrete noun phrase, not a pronoun or transition word.
Low pronoun density. Fewer than 6% of tokens are pronouns. A higher share means unresolved antecedents once the chunk is extracted from its surrounding text.
Fact-dense. At least 3 numeric tokens or proper nouns combined. Facts are what LLMs prefer to quote.
Self-contained. Has a named subject; isn't purely "As we discussed…" scaffolding.
Target size. Between 60% and 120% of the target chunk size (i.e., roughly 307-614 tokens for a 512-token target).

The first four signals carry over from Passage Retrievability. Chunk-level scoring adds the size signal because RAG pipelines pick a chunk size upfront and every chunk needs to fit inside it.
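As a rough illustration of how the five signals could be checked, here is a heuristic sketch. It is not the scorer's real logic: the word lists, the whitespace tokenization, the capitalization-based proper-noun guess, and the equal weighting of signals are all assumptions made for this example, and `score_chunk` is a hypothetical name.

```python
import re

# Assumed word lists; a real scorer would likely use a POS tagger.
PRONOUNS = {"it", "its", "this", "that", "these", "those", "they", "them",
            "their", "he", "she", "him", "her", "we", "us"}
TRANSITIONS = {"however", "so", "because", "also", "thus", "therefore",
               "moreover", "additionally", "furthermore"}
SCAFFOLDING = ("as we discussed", "as mentioned", "as noted", "as discussed")


def _norm(word: str) -> str:
    """Lowercase a word and strip surrounding punctuation."""
    return word.strip(".,;:!?\"'()").lower()


def score_chunk(chunk: str, target_tokens: int) -> dict:
    """Check the five signals; score is the percentage of signals passed."""
    words = chunk.split()  # assumption: words approximate tokens
    if not words:
        return {"signals": {}, "score": 0.0}
    lowered = [_norm(w) for w in words]
    first_sentence = re.split(r"(?<=[.!?])\s+", chunk.strip(), maxsplit=1)[0]

    # Numeric tokens: any word containing a digit.
    numeric = sum(any(ch.isdigit() for ch in w) for w in words)
    # Proper-noun guess: capitalized words that do not start a sentence.
    proper = sum(
        1 for prev, w in zip(words, words[1:])
        if w[:1].isupper() and not prev.endswith((".", "!", "?"))
    )
    pronoun_share = sum(w in PRONOUNS for w in lowered) / len(words)

    signals = {
        "direct_opener": lowered[0] not in (PRONOUNS | TRANSITIONS),
        "low_pronoun_density": pronoun_share < 0.06,
        "fact_dense": numeric + proper >= 3,
        # Crude proxy for "named subject": at least one proper noun,
        # and no scaffolding phrase at the start.
        "self_contained": proper >= 1
        and not first_sentence.lower().startswith(SCAFFOLDING),
        "target_size": 0.6 * target_tokens <= len(words) <= 1.2 * target_tokens,
    }
    score = 100.0 * sum(signals.values()) / len(signals)
    return {"signals": signals, "score": score}
```

A chunk such as "This is because it was discussed earlier." fails every check in this sketch, while a chunk of suitable length that opens with a named subject and contains a few figures passes all five.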

Why three sizes

Different RAG frameworks use different defaults:

256 tokens: very fine-grained. Used when the retriever needs to surgically pick one sentence or one fact. Scoring 256-token chunks catches chunks that are full of unresolved pronouns even at small scope.
512 tokens: a common setting in LangChain and LlamaIndex pipelines (the libraries' built-in defaults vary by splitter and version). Most real-world RAG runs at or near this size.
1024 tokens: used when context matters more than precision. Chunks here are less likely to fail self-containment.

If your 256-token chunks average 40% but your 1024-token chunks average 70%, the page has strong topic coherence but weak sentence-level self-containment. Rewrite the openers; the average should lift across all sizes.
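The comparison across sizes is simple arithmetic, sketched below. The 20-point gap used to flag a problem is an assumed threshold chosen for this example, not a value taken from the scorer, and `diagnose` is a hypothetical helper.

```python
from statistics import mean


def diagnose(scores_by_size: dict[int, list[float]], gap: float = 20.0):
    """Compare average scores of the smallest and largest chunk sizes.

    gap is an assumed threshold in percentage points.
    """
    averages = {
        size: mean(scores)
        for size, scores in scores_by_size.items()
        if scores
    }
    if not averages:
        raise ValueError("no scores to compare")
    smallest, largest = min(averages), max(averages)
    if averages[largest] - averages[smallest] >= gap:
        verdict = "weak sentence-level self-containment: rewrite openers"
    else:
        verdict = "scores are consistent across chunk sizes"
    return averages, verdict
```

With averages of 40 at 256 tokens and 70 at 1024 tokens, the 30-point gap exceeds the assumed threshold and the sketch flags the openers.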

The fix workflow

Run the scorer. Toggle to the 512-token view (the most common RAG setting).
Identify weak chunks (< 50%).
Copy the AI fix prompt. Paste into Claude.
Get rewrites — opener-fixed sentences you can search-and-replace into the page.
Rerun the scorer. Average should lift 10-20 percentage points per pass.
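Steps two and three of the workflow can be approximated in code. This is only a sketch: the prompt wording is invented for illustration and is not the tool's actual AI fix prompt, and `find_weak_chunks` and `build_fix_prompt` are hypothetical names.

```python
WEAK_THRESHOLD = 50.0  # chunks scoring below this are treated as weak


def find_weak_chunks(chunks: list[str], scores: list[float],
                     threshold: float = WEAK_THRESHOLD) -> list[tuple[int, str]]:
    """Return (index, chunk) pairs whose score falls below the threshold."""
    return [
        (index, chunk)
        for index, (chunk, score) in enumerate(zip(chunks, scores))
        if score < threshold
    ]


def build_fix_prompt(weak: list[tuple[int, str]], preview_chars: int = 200) -> str:
    """Assemble a rewrite request; the wording here is an assumption."""
    lines = [
        "Rewrite the opening sentence of each passage below so that it "
        "names its subject directly and uses no unresolved pronouns.",
    ]
    for index, chunk in weak:
        lines.append(f"[chunk {index}] {chunk[:preview_chars]}")
    return "\n".join(lines)
```

After applying the rewrites, rescoring the same chunks shows whether the average actually moved.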

Fact-check notes and sources

LangChain chunk-splitting docs: python.langchain.com/docs/modules/data_connection/document_transformers
LlamaIndex node parsers: docs.llamaindex.ai
Embedding models & chunk size research: arxiv.org/abs/2311.08377

The $100 Network covers writing content that survives RAG chunking — direct openers, fact-density, no cross-paragraph references. The scorer is the verification pass.

RAG Splits Your Page Into Chunks — How To Make Each Chunk Retrievable
