How Many Fan-Out Queries Does Your Site Actually Cover?

The Query Fan-Out Generator produces 30-60 sub-queries per seed keyword. Now what?

The honest answer is "you publish a landing page for each query that doesn't already have one." But that assumes you know which queries already have pages, and for anything larger than a 20-page site that is a research problem nobody wants to do manually.

The Fan-Out Coverage Scorer does it automatically. Paste your fan-out queries, paste your sitemap URL, and the tool fetches the pages listed in the sitemap (capped at 100 URLs to stay fast), extracts each page's title, H1, and meta description, scores every fan-out query against every page with Jaccard similarity, keeps the closest page for each query, and returns:

  • Coverage % — what fraction of fan-out queries have a matching page
  • Orphan list — queries with no matching page, ranked by how orphaned they are
  • Match detail — for queries that did match, which page won and how strong the match was
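The fetch-and-extract step can be sketched with nothing but the Python standard library. This is a minimal sketch, not the tool's actual code: the names `sitemap_urls`, `PageFields`, and `page_text` are made up for illustration, the sketch assumes a plain `urlset` sitemap (not a sitemap index), and the network fetch itself is left out so the functions work on text you already have.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 100  # the cap mentioned in the post


def sitemap_urls(xml_text, limit=MAX_URLS):
    """Return up to `limit` <loc> values from a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return locs[:limit]


class PageFields(HTMLParser):
    """Collects the <title>, the first <h1>, and the meta description."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self.description = ""
        self._current = None   # tag whose text is currently being captured
        self._h1_done = False  # only the first <h1> is kept

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._current = "title"
        elif tag == "h1" and not self._h1_done:
            self._current = "h1"
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def page_text(html):
    """Join title + H1 + meta description into one string for matching."""
    parser = PageFields()
    parser.feed(html)
    parser.close()
    parts = (parser.title, parser.h1, parser.description)
    return " ".join(p.strip() for p in parts if p.strip())
```

Feed `sitemap_urls` the sitemap XML, fetch each returned URL however you like, and pass the HTML to `page_text` to get the string that the matching step scores against.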

How matching works

Matching is token overlap, scored with Jaccard similarity. Each query tokenizes to 3-8 content words after stopword removal. Each page tokenizes its title + H1 + meta description to 15-30 tokens. Match score = intersection / union of the two token sets. Anything above 0.35 is counted as a match.
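Here is a small sketch of that scoring, assuming a simple lowercase alphanumeric tokenizer and a short illustrative stopword list. The tool's real tokenizer and stopword list are not published in this post, so treat `STOPWORDS`, `tokenize`, and `best_match` as hypothetical stand-ins; only the intersection / union formula and the 0.35 threshold come from the text.

```python
import re
from typing import Optional

# Illustrative stopword list (an assumption, not the tool's list).
STOPWORDS = frozenset({
    "a", "an", "and", "are", "can", "do", "for", "how", "i", "in",
    "is", "me", "of", "on", "the", "to", "what", "with", "you",
})
MATCH_THRESHOLD = 0.35  # from the post: scores above this count as a match


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords, dedupe."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity: |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query, pages):
    """Return (url, score) for the highest-scoring page, or (None, 0.0)."""
    q = tokenize(query)
    best_url: Optional[str] = None
    best_score = 0.0
    for url, text in pages.items():
        score = jaccard(q, tokenize(text))
        if score > best_score:
            best_url, best_score = url, score
    return best_url, best_score


pages = {"/pricing": "Cold Email Pricing Guide"}
url, score = best_match("cold email pricing", pages)
is_match = score > MATCH_THRESHOLD
```

Because the union grows with every extra page token, a score computed on a bare title runs higher than one computed on title + H1 + meta description, so scores from this sketch are only comparable when every page is tokenized the same way.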

This is deliberately simple. Semantic matching via embeddings would be more accurate but requires an API and adds cost. Token overlap catches 70-80% of real matches correctly in testing against labeled data, which is enough for the use case: surface the orphan list for content planning.

False positives do happen. A query like "cold email pricing" will match a page titled "Cold Email Pricing Guide" correctly, but might also weakly match "Email Pricing for Enterprises" even though that page targets a different audience. Review the best-match column in the output to spot these.

What orphan queries look like

Typical first run: 40-60% of fan-out queries have a match; the remaining 40-60% are orphans.
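To make the numbers concrete, here is a rough sketch of how a coverage percentage and an orphan list could be derived from per-query best scores. The `Match` record and `coverage_report` function are hypothetical, and ranking orphans by lowest best score first is my assumption about what "most orphaned" means.

```python
from dataclasses import dataclass
from typing import Optional

MATCH_THRESHOLD = 0.35  # from the post: scores above this count as a match


@dataclass
class Match:
    query: str
    url: Optional[str]  # best-scoring page, or None if nothing overlapped
    score: float        # Jaccard score against that page


def coverage_report(matches, threshold=MATCH_THRESHOLD):
    """Split queries into matched and orphan, and compute coverage %."""
    matched = [m for m in matches if m.score > threshold]
    # Assumption: the lower the best score, the more orphaned the query.
    orphans = sorted((m for m in matches if m.score <= threshold),
                     key=lambda m: m.score)
    pct = 100.0 * len(matched) / len(matches) if matches else 0.0
    return {
        "coverage_pct": round(pct, 1),
        "orphans": [m.query for m in orphans],
        "matched": [(m.query, m.url, m.score) for m in matched],
    }
```

A run where 24 of 50 queries clear the threshold would report 48.0% coverage and list the other 26 as orphans, which sits inside the typical first-run range.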

The orphan queries cluster by intent bucket. Commonly missing:

  • Comparison queries ("brand X vs competitor Y", "alternatives to brand X"): most sites don't have dedicated comparison pages.
  • Voice / question-shaped queries ("can you tell me about X", "how do I get started with Y"): most sites don't have FAQ-shaped content at question depth.
  • Follow-up / deep-dive queries ("case study on X", "advanced techniques for Y"): most sites don't have long-tail content.
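If you want to sort an orphan list into these buckets yourself, a few keyword patterns get you most of the way. The sketch below is a rough heuristic under my own assumptions: the pattern lists and the names `BUCKET_PATTERNS`, `bucket_for`, and `group_orphans` are illustrative, not something the scorer exposes.

```python
import re
from collections import defaultdict

# Hypothetical patterns; checked in order, first hit wins.
BUCKET_PATTERNS = [
    ("comparison",
     re.compile(r"\bvs\.?\b|\bversus\b|\balternatives?\b|\bcompared?\b")),
    ("voice/question",
     re.compile(r"^(can|how|what|why|when|where|who|which|should|does|do|is|are)\b")),
    ("follow-up/deep-dive",
     re.compile(r"\bcase stud(y|ies)\b|\badvanced\b|\bdeep dive\b|\bin depth\b")),
]


def bucket_for(query):
    """Return the first bucket whose pattern matches, else 'other'."""
    q = query.lower().strip()
    for name, pattern in BUCKET_PATTERNS:
        if pattern.search(q):
            return name
    return "other"


def group_orphans(orphans):
    """Group orphan queries by intent bucket, preserving input order."""
    groups = defaultdict(list)
    for query in orphans:
        groups[bucket_for(query)].append(query)
    return dict(groups)
```

Anything that lands in "other" is worth a manual look, since keyword rules like these will miss phrasing they were not written for.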

These are the content gaps AI engines notice. When your site has no answer for the fan-out sub-queries, the engine blends answers from other sources, and your citation slot goes to whoever does have the content.

The fix workflow

  1. Run the Query Fan-Out Generator against your seed keyword.
  2. Run this scorer against your sitemap.
  3. Look at the orphan list.
  4. Pick 3-5 high-value orphan queries (comparison + deep-dive usually win).
  5. Publish pages targeting each — or extend existing pages with H2 sections that answer them.
  6. Re-run the scorer in a month. Coverage should have risen.

The long-term target isn't 100% coverage — that's usually content bloat. It's 60-80% coverage with the high-intent buckets fully covered. Follow-up queries matter most for category authority; voice queries matter most for long-tail traffic.

Why 100 URLs and not all of them

The scorer caps its sample at 100 URLs for speed. On a 2000-page site, fetching every page would take 10-20 minutes. A 100-URL sample captures the coverage shape for 80% of queries. If your site is larger and you want full coverage, run the tool multiple times with different sitemap slices.
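The slicing approach and the time trade-off can be sketched in a few lines. Two assumptions here: fetch time scales linearly with page count, and the scorer will accept each slice as a small sitemap file of its own that you host somewhere. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"
SLICE_SIZE = 100  # matches the scorer's per-run cap


def sitemap_slices(urls, size=SLICE_SIZE):
    """Split a URL list into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def slice_to_sitemap(urls):
    """Render one slice as a minimal sitemap XML string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for u in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


def scaled_seconds(full_seconds, full_pages, sample_pages):
    """Scale a full-site fetch time down to a sample (assumes linear scaling)."""
    return full_seconds * sample_pages / full_pages


# Using the post's figures: 10-20 minutes for 2000 pages.
low = scaled_seconds(10 * 60, 2000, 100)   # seconds for one 100-URL run
high = scaled_seconds(20 * 60, 2000, 100)
runs_needed = len(sitemap_slices(["https://example.com/p%d" % i for i in range(2000)]))
```

Under those assumptions a 100-URL run takes roughly 30 to 60 seconds, and covering the full 2000-page site means 20 runs, one per slice.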

For 11ty / Jekyll / Hugo sites the sitemap typically lists the most important pages first, so the sample is biased toward high-value pages — which is exactly what you want when checking coverage.

The $100 Network covers content planning for site networks where each site owns a slice of a fan-out space. The scorer is how you verify that each site actually covers its slice without gaps.

Last updated: April 2026