Empty category page. "Sorry, no products match your filter." HTTP 200. Google crawls it, classifies it as a soft-404, drops it from the index, and files it as one more piece of evidence that the site is thin.
Multiply by 200 empty filter combinations. Multiply by 50 discontinued products that still resolve to a "this product is no longer available" page. Multiply by 20 author archives where the author has never posted. The site is bleeding crawl budget and reputation on pages that return 200.
The HTTP status was the wrong signal. Google's not stupid. The crawler knows what an empty page looks like.
What the Soft 404 Content Quality Overlay does
You paste up to 20 URLs. The tool:
- Fetches each through the proxy.
- Strips boilerplate (nav, footer, scripts).
- Counts words in the remaining content.
- Scans for "not found" phrases — "no results", "page not found", "no longer available", "404", "coming soon", "lorem ipsum", placeholder language.
- Inspects title and H1 for soft-404 patterns.
- Counts images and internal links as quality signals.
- Computes a content-quality score (0-100) per URL.
- Recommends action per URL: 301-or-410, expand content, monitor, or OK.
- Emits an AI prompt with per-URL remediation drafts.
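The fetch-strip-count-scan steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual implementation: the names `ContentExtractor` and `analyze`, the scoring weights, and the rule that a root-relative `href` counts as an internal link are all hypothetical. Title and H1 inspection and the proxy fetch are left out.

```python
from html.parser import HTMLParser
import re

# Phrase list taken from the examples in the post; the real list is not published here.
SOFT_404_PHRASES = ("no results", "page not found", "no longer available",
                    "coming soon", "lorem ipsum")
# Boilerplate containers to ignore (style is an added assumption).
SKIP_TAGS = {"nav", "footer", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collects text outside boilerplate tags and counts images and links."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # greater than 0 while inside a boilerplate element
        self.text_parts = []
        self.images = 0
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            if tag == "img":
                self.images += 1
            elif tag == "a" and (dict(attrs).get("href") or "").startswith("/"):
                self.links += 1   # assumption: root-relative means internal

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.text_parts.append(data)


def analyze(html: str) -> dict:
    """Return word count, matched phrases, and a 0-100 score for one page."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts).lower()
    words = len(re.findall(r"\w+", text))
    hits = [p for p in SOFT_404_PHRASES if p in text]
    # Made-up weighting: up to 70 points for length (300+ words),
    # up to 15 each for images and internal links, minus 30 per phrase hit.
    score = min(70, words * 70 // 300)
    score += min(15, parser.images * 5) + min(15, parser.links * 3)
    score -= 30 * len(hits)
    return {"words": words, "phrases": hits, "score": max(0, min(100, score))}
```

Feeding it an empty filter page with a nav bar, one heading, and a "no results" sentence gives a score at the bottom of the scale, because the nav and footer text never reach the word count and the phrase hit applies a penalty.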
The four soft-404 signatures
1. Thin + "no results" phrase. The classic. Empty filter or search-result page returning 200 with "no results found." Score under 30. Action: remove from index (noindex), 410, or redirect to a canonical category.
2. Discontinued / placeholder. "This product is no longer available." 80 words. Score 30-50. Action: 301 to closest replacement product or 410 if no replacement.
3. Coming-soon / under-construction. A placeholder shipped to production, never replaced. Score under 40. Action: noindex until real content lands, then remove noindex.
4. Stub category / tag with one item. A taxonomy page that has only 1-3 child items, generating a near-empty page wrapped in nav. Score 50-65. Action: consolidate small taxonomies or noindex thin tag pages.
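As a rough illustration, the four signatures can be expressed as ordered rules. The function name `match_signature`, the 100-word cutoff for "thin", and the `child_items` parameter are hypothetical; only the action strings come from the list above.

```python
from typing import Optional

# Actions paraphrased from the four signatures; the matching rules are guesses.
SIGNATURE_ACTIONS = {
    "thin-no-results": "noindex, 410, or redirect to a canonical category",
    "discontinued": "301 to closest replacement, or 410 if none",
    "coming-soon": "noindex until real content lands",
    "stub-taxonomy": "consolidate small taxonomies or noindex",
}


def match_signature(text: str, word_count: int,
                    child_items: Optional[int] = None) -> Optional[str]:
    """Return the first matching signature name, or None if nothing matches."""
    lowered = text.lower()
    if "no results" in lowered and word_count < 100:   # assumed thin cutoff
        return "thin-no-results"
    if "no longer available" in lowered:
        return "discontinued"
    if "coming soon" in lowered or "under construction" in lowered:
        return "coming-soon"
    if child_items is not None and 1 <= child_items <= 3:
        return "stub-taxonomy"
    return None
```

Rule order matters here: a discontinued-product page that also happens to be short is reported as discontinued, since that signature carries the more specific action.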
What the score thresholds mean
- 75+ — substantial content. Not a soft-404.
- 55-74 — borderline. Worth monitoring; could degrade.
- 30-54 — thin but recoverable. Expand content.
- Under 30 — likely soft-404. Redirect or 410.
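The bands map directly onto the four recommended actions. A minimal sketch, assuming each band's lower bound is inclusive (the function name `recommend` is hypothetical):

```python
def recommend(score: int) -> str:
    """Map a 0-100 content-quality score to one of the four actions."""
    if score >= 75:
        return "ok"
    if score >= 55:
        return "monitor"
    if score >= 30:
        return "expand content"
    return "301-or-410"
```

For example, `recommend(54)` returns `"expand content"` and `recommend(29)` returns `"301-or-410"`.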
When to redirect vs 410
A 301 redirect transmits PageRank to the destination. A 410 explicitly tells Google "this is gone, drop it." The choice depends on whether the URL has external value:
- 301 — the URL has external backlinks OR the URL is a former product that still gets organic traffic. Redirect to closest equivalent.
- 410 — the URL has no external value AND there's no logical destination. Tell Google to forget it.
- Noindex (with 200) — the page must stay live for users (e.g., a specific user-account page) but should leave the index.
The audit's AI prompt asks the model to make this call per URL based on context.
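For readers who would rather make the call by rule than by prompt, the three bullets translate into a small decision function. This is a sketch, not the audit's logic: the function name and its boolean inputs are hypothetical, and the two cases the bullets leave open (value but no destination, destination but no value) are returned as "review" rather than guessed.

```python
def choose_action(has_backlinks: bool, has_organic_traffic: bool,
                  has_destination: bool, must_stay_live: bool) -> str:
    """Pick 301, 410, noindex, or review from the rules in this section."""
    if must_stay_live:
        return "noindex"              # page stays at 200 but leaves the index
    has_value = has_backlinks or has_organic_traffic
    if has_value and has_destination:
        return "301"                  # redirect to the closest equivalent
    if not has_value and not has_destination:
        return "410"                  # nothing to preserve, nowhere to send it
    return "review"                   # not covered by the rules above
```

A discontinued product with backlinks and a successor model comes back as `"301"`; an empty tag page nobody links to comes back as `"410"`.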
Why soft-404s are getting more punitive in 2026
Google's HCU (Helpful Content Update), which was folded into the core ranking systems in March 2024, and the core updates since then assess content quality at the site level as well as the page level. A site where a large share of indexable URLs are soft-404s (say 30%) risks being treated as a low-quality site overall, even on its high-quality pages.
Cleaning up soft-404s isn't just about reclaiming the bad URLs. It's about lifting the average quality of what Google sees.
The 30-day soft-404 cleanup path
Week 1: Pull "Crawled - currently not indexed" and "Discovered - currently not indexed" from the Page indexing report in GSC, along with anything already flagged "Soft 404". Run the audit on the top 20 of each.
Week 2: Triage findings. Decide redirect/410/expand for each. Build the redirect map.
Week 3: Implement. Server config (Netlify _redirects, .htaccess, Nginx return directives), CMS noindex flags, content expansion for recoverable URLs.
Week 4: Re-submit sitemap (with soft-404 URLs removed). Re-request indexing in GSC for the pages you expanded and the other URLs that should stay indexed.
By day 60, a month after the cleanup ships, GSC's indexed-page count should be cleaner: fewer "currently not indexed" entries, more pages in "indexed."
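The Week 2 redirect map can be turned into Week 3 server config mechanically. The sketch below assumes the map is a plain dict of old path to new path, with `None` meaning "gone"; the function names and example paths are hypothetical, and paths are assumed to contain no spaces or regex characters. It emits Nginx `return` rules and Apache mod_alias `Redirect` lines.

```python
from typing import Dict, List, Optional

RedirectMap = Dict[str, Optional[str]]   # old path -> new path, or None for 410


def nginx_rules(redirect_map: RedirectMap) -> List[str]:
    """One exact-match location block per URL, using return 301 or return 410."""
    lines = []
    for old_path, target in sorted(redirect_map.items()):
        if target is None:
            lines.append(f"location = {old_path} {{ return 410; }}")
        else:
            lines.append(f"location = {old_path} {{ return 301 {target}; }}")
    return lines


def htaccess_rules(redirect_map: RedirectMap) -> List[str]:
    """mod_alias lines; note that Redirect matches the path as a prefix."""
    lines = []
    for old_path, target in sorted(redirect_map.items()):
        if target is None:
            lines.append(f"Redirect gone {old_path}")
        else:
            lines.append(f"Redirect 301 {old_path} {target}")
    return lines


if __name__ == "__main__":
    example = {
        "/shoes/discontinued-runner": "/shoes/runner-v2",   # made-up paths
        "/tag/empty": None,
    }
    print("\n".join(nginx_rules(example)))
    print("\n".join(htaccess_rules(example)))
```

Keeping the map as data and generating the config from it means the same triage decisions can be re-rendered if the site moves hosts. Test the generated rules on a staging server before deploying, since prefix matching in `.htaccess` can catch more URLs than intended.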
Related reading
- Soft 404 Detector — the predecessor tool, single-URL focused
- Index Coverage Delta — sitemap vs crawl gap detection
- Duplicate Content Fingerprint — pairs with this for index-cleanup
- Mega SEO Analyzer — full sweep including soft-404 detection
Fact-check notes and sources
- Google soft-404 documentation: Google Search Central — Soft 404 errors
- Helpful Content Update guidance: Google Search Central — Helpful content update
- 410 vs 404 vs 301 decision logic: Google Search Central — Block search indexing with noindex
- Score thresholds (under 30 / 30-55 / etc.): heuristic — the real Google classifier weighs additional signals (clickstream, user engagement, machine-learned content quality)
This post is informational, not SEO-cleanup-consulting advice. Mention of Google is nominative fair use. No affiliation is implied.