# Cut The Filler Paragraphs — Measuring Information Payload Per Paragraph

A 2,000-word article with one dense paragraph and twenty filler paragraphs ranks worse than a 600-word article that's dense the whole way through. This scorer measures information payload per paragraph (noun-phrase density, entities per 100 words, specific-claim markers such as numbers, dates, and money, and filler ratio) and flags paragraphs to cut.

Author: J.A. Watte
Published: May 10, 2026
Source: https://jwatte.com/blog/blog-tool-paragraph-semantic-density/

---

There are two kinds of 2,000-word articles. The first has twenty dense paragraphs, each naming specific entities, citing numbers, and making concrete claims. The second has one good paragraph and nineteen filler paragraphs: "it's important to note," "at the end of the day," "needless to say," followed by generic advice repeated in slightly different words.

Google's helpful-content system distinguishes between them. Users distinguish between them. AI answer engines pick the dense one to cite.

[Paragraph Semantic Density](/tools/paragraph-semantic-density/) measures the information payload of every paragraph on your page.

## The five signals per paragraph

**Noun-phrase density.** Count of multi-word capitalized spans + "the X of Y" patterns per 100 words. High NP density = the paragraph is saying specific things about specific named entities, not floating generalities.
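
A minimal Python sketch of this signal, assuming both patterns can be approximated with regular expressions. The function name `noun_phrase_density` and the two regexes are illustrative assumptions, not the tool's actual code. One known weakness: a capitalized sentence opener followed by a name (say, "The Google") is counted as a span.

```python
import re

# Two or more consecutive capitalized words, e.g. "Google Search Console".
CAPITALIZED_SPAN = re.compile(r"\b[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)+")
# "the X of Y" with single-word X and Y (an assumed simplification).
THE_X_OF_Y = re.compile(r"\bthe\s+\w+\s+of\s+\w+", re.IGNORECASE)


def noun_phrase_density(paragraph: str) -> float:
    """Return noun-phrase pattern hits per 100 words."""
    words = paragraph.split()
    if not words:
        return 0.0
    hits = len(CAPITALIZED_SPAN.findall(paragraph))
    hits += len(THE_X_OF_Y.findall(paragraph))
    return 100.0 * hits / len(words)


dense = "Google Search Console shows the ranking of pages every day"
vague = "it is important to note that things are basically what they are"
print(noun_phrase_density(dense), noun_phrase_density(vague))
```

The dense sentence scores on both patterns (one capitalized span, one "the X of Y"), while the vague one scores on neither.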

**Entity density.** Count of proper nouns and other specific tokens per 100 words: numbers, brand names, place names, person names. Entity-dense paragraphs are information-rich; entity-free paragraphs are opinion-mush.
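
A rough sketch of entity counting without an NLP library. It assumes that a capitalized word which does not open a sentence is a proper noun and that any numeric token counts as an entity. Both heuristics and the name `entity_density` are assumptions made for illustration; a real implementation would more likely use a named-entity recognizer.

```python
import re

# Numeric token, optionally with a currency sign, percent sign, or end punctuation.
NUMERIC = re.compile(r"[$€£]?\d[\d,.]*%?[.!?]?")


def entity_density(paragraph: str) -> float:
    """Return heuristic entity count per 100 words."""
    words = paragraph.split()
    if not words:
        return 0.0
    entities = 0
    sentence_start = True
    for raw in words:
        token = raw.strip("\"'()[],;:")
        is_number = bool(NUMERIC.fullmatch(token))
        is_capitalized = token[:1].isupper()
        # Skip capitalized sentence openers; they are usually not names.
        if is_number or (is_capitalized and not sentence_start):
            entities += 1
        sentence_start = raw.endswith((".", "!", "?"))
    return 100.0 * entities / len(words)


print(entity_density("We moved the site from WordPress to Eleventy in 2024."))
```

The heuristic misses names that open a sentence and over-counts the pronoun "I", so treat the result as a ranking signal rather than an exact count.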

**Specific-claim markers.** Count of numbers (`42`, `3.2%`, `2026`), year markers, and money amounts (`$99`, `€15`) per 100 words. Specific claims anchor a paragraph to verifiable reality.
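
These markers map naturally onto regular expressions. The sketch below assumes each numeric token is counted once and then labeled as money, year, or plain number; whether the real tool double-counts years as numbers is not stated, so that choice is an assumption, as are the names `CLAIM_TOKEN` and `claim_markers`.

```python
import re
from collections import Counter

# A number with optional currency prefix, thousands separators, decimals, percent.
# The lookbehind skips digits glued to letters, such as "v3" or "H2".
CLAIM_TOKEN = re.compile(r"(?<![\w.])[$€£]?\d+(?:,\d{3})*(?:\.\d+)?%?")
YEAR = re.compile(r"(?:19|20)\d{2}")  # assumed year range: 1900-2099


def classify(token: str) -> str:
    if token[0] in "$€£":
        return "money"
    if YEAR.fullmatch(token):
        return "year"
    return "number"


def claim_markers(paragraph: str) -> tuple[dict[str, int], float]:
    """Return per-type marker counts and total markers per 100 words."""
    counts = Counter(classify(t) for t in CLAIM_TOKEN.findall(paragraph))
    word_count = len(paragraph.split())
    per_100 = 100.0 * sum(counts.values()) / word_count if word_count else 0.0
    return dict(counts), per_100


print(claim_markers("In 2026 the plan cost $99 and churn fell 3.2%."))
```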

**Filler ratio.** `(stopwords + hedges × 3 + filler-phrases × 10) / word count`. Hedges like "probably," "basically," and "essentially" get 3× weight because they signal the author isn't committed to their own claims. Filler phrases like "needless to say" get 10× weight because their entire function is to fill space.
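
The formula translates almost directly into code. In this sketch the word lists are tiny assumed samples, not the tool's real lists, and words inside a filler phrase are also counted as stopwords, which the formula does not settle either way. Because of the weights, the ratio can exceed 1.

```python
import re

# Small illustrative samples; the tool's actual lists are not published here.
STOPWORDS = {"a", "an", "the", "of", "to", "is", "it", "and", "that", "in", "at"}
HEDGES = {"probably", "basically", "essentially"}
FILLER_PHRASES = ("needless to say", "it's important to note", "at the end of the day")


def filler_ratio(paragraph: str) -> float:
    """Return (stopwords + 3 * hedges + 10 * filler phrases) / word count."""
    text = paragraph.lower().replace("\u2019", "'")  # normalize curly apostrophes
    words = re.findall(r"[a-z0-9']+", text)
    if not words:
        return 0.0
    stopwords = sum(w in STOPWORDS for w in words)
    hedges = sum(w in HEDGES for w in words)
    normalized = " ".join(words)  # punctuation removed so phrases match cleanly
    phrases = sum(normalized.count(p) for p in FILLER_PHRASES)
    return (stopwords + 3 * hedges + 10 * phrases) / len(words)


print(filler_ratio("Needless to say, it is basically the end."))
print(filler_ratio("Shopify raised prices"))
```

The first sentence has 8 words, 4 stopwords, 1 hedge, and 1 filler phrase, so it scores (4 + 3 + 10) / 8.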

**Word count floor.** Paragraphs under 40 words usually don't carry enough payload to score high even if they're dense. Merge or expand.

## What to do with the filler list

The AI rewrite prompt emits per-paragraph dispositions: **cut / merge / rewrite**. Cut when the paragraph is pure hedge-and-stopword content. Merge when it's a short transitional note that fits into the paragraph above or below. Rewrite when the paragraph says something true but vaguely — rewrite with one specific example, one number, one named entity.
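
The tool leaves this decision to an AI prompt, but the same rules can be approximated with plain thresholds. Everything in this sketch is an assumption for illustration: the `ParagraphScore` fields, the `HIGH_FILLER` cutoff, the rule order, and the extra `keep` label for paragraphs that need no action. Only the 40-word floor comes from the signal list above.

```python
from dataclasses import dataclass

WORD_FLOOR = 40    # word count floor from the signal list
HIGH_FILLER = 0.8  # hypothetical cutoff for "pure hedge-and-stopword content"


@dataclass
class ParagraphScore:
    word_count: int
    entity_density: float  # entities per 100 words
    claim_density: float   # specific-claim markers per 100 words
    filler_ratio: float


def disposition(score: ParagraphScore) -> str:
    """Map a paragraph's scores to cut / merge / rewrite / keep."""
    has_payload = score.entity_density > 0 or score.claim_density > 0
    if not has_payload and score.filler_ratio >= HIGH_FILLER:
        return "cut"      # nothing specific, mostly filler
    if score.word_count < WORD_FLOOR:
        return "merge"    # too short to stand alone
    if not has_payload:
        return "rewrite"  # long enough, but vague
    return "keep"


print(disposition(ParagraphScore(25, 4.0, 0.0, 0.4)))
```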

Most articles can lose 20-30% of their paragraphs without losing meaning. The page's overall density rises simply because the remaining paragraphs are no longer diluted.
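
The dilution effect is plain arithmetic. A small worked example with made-up numbers: ten 80-word paragraphs, three of them with no entities at all. The paragraph sizes, the entity counts, and the `page_density` helper are all hypothetical.

```python
def page_density(paragraphs: list[tuple[int, int]]) -> float:
    """Entities per 100 words across (word_count, entity_count) pairs."""
    total_words = sum(words for words, _ in paragraphs)
    total_entities = sum(entities for _, entities in paragraphs)
    return 100.0 * total_entities / total_words if total_words else 0.0


dense = [(80, 8)] * 7   # seven paragraphs with 8 entities each
filler = [(80, 0)] * 3  # three paragraphs with none

before = page_density(dense + filler)  # 56 entities over 800 words
after = page_density(dense)            # 56 entities over 560 words
print(before, after)
```

Cutting 30% of the paragraphs moves the page from 7 to 10 entities per 100 words without touching a single remaining sentence.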

## Why not just trust Flesch + word count

[Voice & Tone](/tools/voice-tone/) measures Flesch and passive-voice ratio, both readability concerns. This tool measures something orthogonal: information payload. A highly readable page can still be full of filler; a dense page can still be readable. They're different axes.

## Related reading

- [Voice Cleanup](/tools/voice-cleanup/) — de-slop after density cleanup
- [Voice & Tone](/tools/voice-tone/) — readability + passive voice
- [Passage Retrievability](/tools/passage-retrievability/) — paragraph-level retrieval scoring

## Fact-check notes and sources

- Helpful Content Update: [developers.google.com/search/updates/helpful-content-update](https://developers.google.com/search/updates/helpful-content-update)
- Information theory primer: [en.wikipedia.org/wiki/Information_content](https://en.wikipedia.org/wiki/Information_content)

---

*The $20 Dollar Agency covers editing-against-density as a skill. The tool is how you teach an editor to see filler without the editor already being world-class.*


