How Many Fan-Out Queries Does Your Site Actually Cover?

April 25, 2026

Editorial note. The publication date shown above may be in the future. That is intentional. Posts on this site are scheduled against an editorial calendar that aligns with product releases, book launches, and platform-signal timing; the datePublished reflects the date the post is slated to go public, which is also the date indexers and syndication partners should treat as canonical. If you are reading this before that date you were early — welcome.

The Query Fan-Out Generator produces 30-60 sub-queries per seed keyword. Now what?

The honest answer is "you publish a landing page for each query that doesn't already have one." But that assumes you know which queries already have pages, and for anything larger than a 20-page site, finding out is a research problem nobody wants to do manually.

The Fan-Out Coverage Scorer does it automatically. Paste your fan-out queries, paste your sitemap URL, and the tool fetches every sitemap URL (up to 100 to stay fast), extracts title + H1 + meta description, Jaccard-matches each fan-out query against the closest page, and returns:

Coverage % — what fraction of fan-out queries have a matching page
Orphan list — queries with no matching page, ranked by how orphaned they are
Match detail — for queries that did match, which page won and how strong the match was
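The fetch-and-extract step can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the tool's actual code: the names `sitemap_urls`, `PageFields`, and `extract_fields` are hypothetical, the sitemap is assumed to be a plain `urlset` (not a sitemap index), and the HTTP fetch is left out so the parsing logic stays visible.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace defined by the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 100  # the cap described in the post


def sitemap_urls(sitemap_xml, limit=MAX_URLS):
    """Return up to `limit` <loc> values from a urlset sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", SITEMAP_NS) if el.text]
    return locs[:limit]


class PageFields(HTMLParser):
    """Collects the <title>, the first <h1>, and the meta description."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "h1": "", "description": ""}
        self._open = None       # field currently being read, if any
        self._h1_done = False   # only the first <h1> is kept

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._open = "title"
        elif tag == "h1" and not self._h1_done:
            self._open = "h1"
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.fields["description"] = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == self._open:
            if tag == "h1":
                self._h1_done = True
            self._open = None

    def handle_data(self, data):
        if self._open:
            self.fields[self._open] += data


def extract_fields(html):
    """Parse one HTML document and return whitespace-normalized fields."""
    parser = PageFields()
    parser.feed(html)
    return {name: " ".join(value.split()) for name, value in parser.fields.items()}
```

Concatenating the three returned fields gives the page text that each fan-out query is compared against.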

How matching works

Token overlap via Jaccard similarity. Each query tokenizes to 3-8 content words after stopword removal. Each page tokenizes its title + H1 + meta description to 15-30 tokens. Match score = intersection / union. Anything above 0.35 is counted as a match.
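The scoring described above can be sketched in a few lines. Treat this as an illustration rather than the scorer's implementation: the stopword list, the tokenizer regex, and the function names are assumptions, and ranking orphans by lowest best-match score is one plausible reading of "how orphaned they are".

```python
import re

# Assumed stopword list; the tool's real list is not published in the post.
STOPWORDS = frozenset({
    "a", "an", "and", "are", "can", "do", "does", "for", "how", "i", "in",
    "is", "me", "of", "on", "or", "the", "to", "what", "with", "you",
})
MATCH_THRESHOLD = 0.35  # "anything above 0.35 is counted as a match"


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords, dedupe."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity: |intersection| / |union|; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(query, pages):
    """Return (url, score) of the closest page; url is None if nothing overlaps."""
    q = tokenize(query)
    best_url, best_score = None, 0.0
    for url, text in pages.items():
        score = jaccard(q, tokenize(text))
        if score > best_score:
            best_url, best_score = url, score
    return best_url, best_score


def score_coverage(queries, pages):
    """pages maps URL -> concatenated title + H1 + meta description text."""
    matches, orphans = [], []
    for query in queries:
        url, score = best_match(query, pages)
        (matches if score > MATCH_THRESHOLD else orphans).append((query, url, score))
    orphans.sort(key=lambda row: row[2])  # most orphaned (lowest score) first
    coverage = 100.0 * len(matches) / len(queries) if queries else 0.0
    return {"coverage_pct": coverage, "matches": matches, "orphans": orphans}
```

Calling `score_coverage(queries, pages)` with a list of fan-out queries and a URL-to-text mapping returns the three outputs listed earlier: coverage percentage, orphan list, and match detail.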

This is deliberately simple. Semantic matching via embeddings would be more accurate, but it requires an API and adds cost. In testing against labeled data, token overlap correctly caught 70-80% of real matches, which is enough for the use case: surfacing the orphan list for content planning.

False positives do happen. A query like "cold email pricing" will match a page titled "Cold Email Pricing Guide" correctly, but might also weakly match "Email Pricing for Enterprises" even though that page targets a different audience. Review the best-match column in the output to spot these.
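To see the numbers behind that example, here is a small worked sketch that scores the query against the two page titles only. It is an illustration under assumptions: the stopword list and helper names are hypothetical, and real page text (title + H1 + meta description) is longer, which enlarges the union and will generally lower both scores.

```python
import re

STOPWORDS = frozenset({"a", "an", "and", "for", "of", "the", "to"})  # assumed list
MATCH_THRESHOLD = 0.35


def tokens(text):
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def jaccard(a, b):
    return len(a & b) / len(a | b) if a and b else 0.0


def rank_pages(query, titles):
    """Return (score, title) pairs, best match first."""
    q = tokens(query)
    return sorted(((jaccard(q, tokens(t)), t) for t in titles), reverse=True)


def is_ambiguous(ranked, threshold=MATCH_THRESHOLD):
    """True when the runner-up also clears the match threshold."""
    return len(ranked) > 1 and ranked[1][0] > threshold


ranked = rank_pages(
    "cold email pricing",
    ["Email Pricing for Enterprises", "Cold Email Pricing Guide"],
)
for score, title in ranked:
    print(f"{score:.2f}  {title}")
# prints:
# 0.75  Cold Email Pricing Guide
# 0.50  Email Pricing for Enterprises
```

On titles alone the enterprise page scores 0.50 (2 shared tokens out of 4 distinct), which is above the 0.35 cutoff. If the pricing guide did not exist, the enterprise page would be reported as the match, so a check like `is_ambiguous` is one way to flag rows worth reviewing by hand.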

What orphan queries look like

Typical first run: 40-60% of fan-out queries have a match; the remaining 40-60% are orphans.

The orphan queries cluster by intent bucket. Commonly missing:

Comparison queries ("brand X vs competitor Y", "alternatives to brand X"): most sites don't have dedicated comparison pages.
Voice / question-shaped queries ("can you tell me about X", "how do I get started with Y"): most sites don't have FAQ-shaped content at question depth.
Follow-up / deep-dive queries ("case study on X", "advanced techniques for Y"): most sites don't have long-tail content at that depth.

These are the content gaps AI engines notice. When your site has no answer for the fan-out sub-queries, the engine blends answers from other sources, and your citation slot goes to whoever does have the content.
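Grouping the orphan list by these buckets makes the gaps easier to scan. The sketch below uses simple regex heuristics; the patterns and bucket names are assumptions made for illustration and are not how the scorer or any AI engine classifies queries.

```python
import re
from collections import defaultdict

# Hypothetical patterns, checked in order; the first match wins.
BUCKET_PATTERNS = [
    ("comparison", re.compile(r"\b(vs|versus)\b|\balternatives? to\b")),
    ("voice", re.compile(r"^(can|how|what|why|when|where|who|which|should)\b")),
    ("deep-dive", re.compile(r"\bcase stud(y|ies)\b|\badvanced\b|\bdeep dive\b")),
]


def intent_bucket(query):
    """Return the first bucket whose pattern matches, else 'other'."""
    q = query.lower().strip()
    for name, pattern in BUCKET_PATTERNS:
        if pattern.search(q):
            return name
    return "other"


def group_orphans(orphans):
    """Map bucket name -> list of orphan queries, preserving input order."""
    groups = defaultdict(list)
    for query in orphans:
        groups[intent_bucket(query)].append(query)
    return dict(groups)
```

A bucket with many orphans and no matched queries is a reasonable first candidate for new pages.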

The fix workflow

Run the Query Fan-Out Generator against your seed keyword.
Run this scorer against your sitemap.
Look at the orphan list.
Pick 3-5 high-value orphan queries (comparison + deep-dive usually win).
Publish pages targeting each — or extend existing pages with H2 sections that answer them.
Re-run the scorer in a month. Coverage should have lifted.
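The re-run step is easier to judge with a direct before/after comparison. A minimal sketch, assuming you keep each run's results as a mapping from query to best-match score and use the same query list both times (the function names and data shape are hypothetical, not the tool's export format):

```python
MATCH_THRESHOLD = 0.35  # same cutoff the scorer uses


def coverage_pct(scores):
    """Percentage of queries whose best-match score is above the threshold."""
    if not scores:
        return 0.0
    matched = sum(1 for s in scores.values() if s > MATCH_THRESHOLD)
    return 100.0 * matched / len(scores)


def compare_runs(before, after):
    """before/after: dict of query -> best-match score for the same queries."""
    orphans_before = {q for q, s in before.items() if s <= MATCH_THRESHOLD}
    orphans_after = {q for q, s in after.items() if s <= MATCH_THRESHOLD}
    return {
        "lift_pct": coverage_pct(after) - coverage_pct(before),
        "resolved": sorted(orphans_before - orphans_after),
        "still_orphan": sorted(orphans_before & orphans_after),
    }
```

The `resolved` list shows which of the targeted queries now have a page, and `still_orphan` is the backlog for the next round.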

The long-term target isn't 100% coverage — that's usually content bloat. It's 60-80% coverage with the high-intent buckets fully covered. Follow-up queries matter most for category authority; voice queries matter most for long-tail traffic.

Why 100 URLs and not all of them

The scorer samples to 100 URLs for speed. On a 2000-page site, fetching every page would take 10-20 minutes. Sampling 100 captures the coverage shape for 80% of queries. If your site is larger and you want full coverage, run the tool multiple times with different sitemap slices.
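Running the tool over several slices can be sketched as follows. This assumes you split the URL list yourself and record each run as a mapping from query to (URL, score); both helper names and that data shape are hypothetical.

```python
SLICE_SIZE = 100  # the per-run cap described above


def sitemap_slices(urls, size=SLICE_SIZE):
    """Split a URL list into consecutive slices of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def merge_best(runs):
    """Combine per-slice results, keeping the highest-scoring page per query.

    runs: list of dicts mapping query -> (url, score).
    """
    merged = {}
    for run in runs:
        for query, (url, score) in run.items():
            if query not in merged or score > merged[query][1]:
                merged[query] = (url, score)
    return merged
```

A 2000-page site splits into 20 slices of 100 URLs. A query only counts as an orphan for the whole site if its merged best score stays below the match threshold.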

For 11ty / Jekyll / Hugo sites the sitemap typically lists the most important pages first, so the sample is biased toward high-value pages — which is exactly what you want when checking coverage.

Fact-check notes and sources

Jaccard index: en.wikipedia.org/wiki/Jaccard_index
Google AI Mode query fan-out: blog.google/products/search/google-search-ai-mode-update
Sitemap protocol (xml sitemap structure): sitemaps.org/protocol.html

The $100 Network covers content-planning for site networks where each site owns a slice of a fan-out space. The scorer is how you verify each site actually covers its slice without gaps.
