A natural link graph has variation. Humans don't link to your pricing page with the same two-word anchor every time. Sometimes they write "see our plans." Sometimes "check the pricing." Sometimes just "here." The entropy of the anchor-text distribution is high because humans are high-entropy.
Automated internal linking — in a CMS template, a sidebar widget, a mass-replace sed command — produces low-entropy distributions. Every link to /pricing literally says pricing. Google's helpful-content signals treat this as a keyword-stuffing pattern and discount those links.
Anchor Text Entropy Scanner crawls your sitemap (up to 100 pages), aggregates every internal anchor per target URL, and computes the Shannon entropy of the distribution.
The three problems entropy math catches
Over-optimization (entropy < 1.5). The internal link set to one target URL has minimal variation. Often one exact-match anchor on 90% of the links. Fix: rewrite sidebar and in-content anchors to use synonyms, partial matches, descriptive phrases.
Generic-anchor flood (>30% generic anchors). Classics: "click here," "read more," "learn more," "see more." Google's link context model weighs descriptive anchors heavily; generic anchors carry no topical signal. Fix: rewrite sentences so the anchor is contextual.
Exact-match spam (>70% single anchor). Worst case of over-optimization: one anchor dominates to an absurd degree. Almost always template-driven. Template fix = site-wide cleanup.
Why Shannon entropy is the right measure
Shannon entropy quantifies the "surprise" of a distribution. Maximum entropy = uniform distribution across N variants (every anchor different). Zero entropy = one anchor, always. The scanner computes -Σ(p_i × log2(p_i)) for each target URL's anchor set.
For a healthy internal-link anchor distribution, expect entropy in the 2.0-3.5 range. Under 1.5 is the over-optimization tier. Zero is pathological (literally every link uses the same anchor, which is a content-template bug).
What to do after the scan
The AI fix prompt emits per-target anchor-variation suggestions: 5-8 alternate phrasings that cover synonyms, partial matches, and descriptive contexts. Hand that to a content editor and the rewrite happens in 2-3 hours per 100 target URLs.
Related reading
- Internal Link Auditor — finds 404s + 301 chains in internal links
- Link Graph — orphan + hub visualization
Fact-check notes and sources
- Shannon entropy: en.wikipedia.org/wiki/Entropy_(information_theory)
- Google on link-crawlability and anchor text: developers.google.com/search/docs/crawling-indexing/links-crawlable
- Helpful Content Update: developers.google.com/search/updates/helpful-content-update
The $100 Network covers internal-linking strategies at network scale, where one bad template propagates to 1000 pages. The scanner is the audit to catch it.