How One E-Commerce Category Page Spawns 10,000 Crawl-Waste URLs

The math: a category page with N facet filters, each offering K values, can produce up to (K+1)^N unique URLs, because every facet is either unset or set to one of its K values. Add sort, page, and utm parameters and the number explodes into five digits fast. For most e-commerce sites, one popular category can hold more combinatorial URLs than Googlebot is likely to crawl in a month.
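
A quick way to sanity-check the formula is to compute it directly. This is a minimal sketch that assumes every facet offers the same number of values and treats sort and page as plain multipliers; the function name and the sample numbers are illustrative and not taken from the detector.

```python
def facet_url_count(num_facets: int, values_per_facet: int,
                    sort_options: int = 1, pages: int = 1) -> int:
    """Upper bound on distinct URLs for one category page.

    Each facet is either unset or set to one of its values, giving
    (K + 1) ** N filter states. sort_options and pages include the
    default state, so pass 1 to ignore them.
    """
    filter_states = (values_per_facet + 1) ** num_facets
    return filter_states * sort_options * pages


# Hypothetical category: 4 facets with 5 values each.
print(facet_url_count(4, 5))                           # 1296
print(facet_url_count(4, 5, sort_options=4, pages=3))  # 15552
```

Four modest facets already give four digits; a handful of sort orders and pages pushes the same category past 15,000.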

Facet Trap Detector scans a sample category URL, extracts every facet link, and does the combo math.
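
The extract-and-count step can be approximated in a few lines. The sketch below is not the detector's actual code: it assumes the facet links have already been pulled out of the page as href strings, and the helper names and sample links are hypothetical.

```python
from collections import defaultdict
from math import prod
from urllib.parse import urlsplit, parse_qsl


def facet_values(hrefs):
    """Map each query parameter name to the set of values seen in hrefs."""
    seen = defaultdict(set)
    for href in hrefs:
        for name, value in parse_qsl(urlsplit(href).query):
            seen[name].add(value)
    return dict(seen)


def combination_count(facets):
    """Each parameter is either absent or takes one observed value."""
    return prod(len(values) + 1 for values in facets.values())


# Hypothetical facet links found on one category page.
links = [
    "/shirts?color=red",
    "/shirts?color=blue",
    "/shirts?size=S",
    "/shirts?size=M",
    "/shirts?size=L",
    "/shirts?sort=price",
]
facets = facet_values(links)
print(combination_count(facets))  # 3 * 4 * 2 = 24
```

Six links on the page imply 24 reachable URLs, which is the gap between what a crawler sees on one page and what it can eventually request.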

Why single-param crawl-waste auditing misses this

Param Crawl-Waste already counts parameter frequency across a sitemap. It catches ?sort=price appearing on 500 URLs. What it doesn't do is compute what happens when ?sort=price combines with ?color=red and ?size=M and ?page=2 and ?utm_source=email. The math is different: single-param is additive, combinatorial is multiplicative.
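
To make the additive versus multiplicative difference concrete, the sketch below enumerates every combination for a small set of parameters. The parameter values and the /shirts path are made up for illustration.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical parameters and values for one category page.
params = {
    "sort": ["price"],
    "color": ["red", "blue"],
    "size": ["S", "M"],
    "page": ["2"],
    "utm_source": ["email"],
}

# Additive view: one URL per (parameter, value) pair.
single_param_urls = sum(len(values) for values in params.values())

# Multiplicative view: every parameter is either absent (None) or set.
choices = [[None, *values] for values in params.values()]
combined_urls = []
for combo in product(*choices):
    query = {name: value for name, value in zip(params, combo) if value is not None}
    combined_urls.append("/shirts?" + urlencode(query) if query else "/shirts")

print(single_param_urls)   # 7
print(len(combined_urls))  # 72
```

Seven parameter values look harmless in a frequency report, yet they combine into 72 distinct URLs.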

For one small e-commerce site I tested, Param Crawl-Waste reported 18% crawl waste. Facet Trap Detector reported that one category page alone could generate 4,200 URLs for 65 products, roughly 65 URLs per product. Both are right; they measure different axes.

The classification algorithm

Every facet parameter needs one of three dispositions:

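  1. Block. utm_*, fbclid, gclid, ref, session tokens, sort, page, view-mode. These should never be crawled. Add a robots.txt rule per parameter, e.g. Disallow: /*?*utm_ (the pattern /*?utm_* only matches when the parameter comes first in the query string). Note that Disallow stops crawling, not indexing: a blocked URL that is linked from elsewhere can still show up in the index.
  2. Canonicalize. color, size, brand, material facets. The variants should exist (users filter them), but the canonical tag on the variant page should point to the unfiltered category. Google will still crawl the variants, but treats the canonical as a strong hint and will usually index only the unfiltered category.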
  3. Index. A specific facet combination that targets a real search query. "Red T-Shirts" is a query people search for, so ?color=red should be its own indexable page with a unique title + H1.
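
The three dispositions above can be sketched as a small rule table. This assumes classification by parameter name plus a hand-maintained allow-list of indexable combinations; sessionid and view stand in for "session tokens" and "view-mode", and the allow-list entry is hypothetical.

```python
import fnmatch

# Patterns follow the Block list above; sessionid and view are assumed names.
BLOCK_PATTERNS = ["utm_*", "fbclid", "gclid", "ref", "sessionid", "sort", "page", "view"]
CANONICALIZE_PARAMS = {"color", "size", "brand", "material"}
# Hypothetical allow-list of facet values that match a real search query.
INDEXABLE = {("color", "red")}


def classify(name, value):
    """Return 'block', 'index', or 'canonicalize' for one parameter.

    Anything not covered by the rules comes back as 'unclassified'
    so it can be reviewed by hand.
    """
    key = name.lower()
    if any(fnmatch.fnmatchcase(key, pattern) for pattern in BLOCK_PATTERNS):
        return "block"
    if (key, value.lower()) in INDEXABLE:
        return "index"
    if key in CANONICALIZE_PARAMS:
        return "canonicalize"
    return "unclassified"


print(classify("utm_source", "email"))  # block
print(classify("color", "red"))         # index
print(classify("color", "blue"))        # canonicalize
```

Checking the block list first means a tracking or sort parameter can never be promoted to indexable by accident.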

Most e-commerce sites ship 100% of facets as indexable. A better target is roughly 80% block, 15% canonicalize, 5% index.

What the fix buys you

On a mid-sized catalog site, fixing the facet trap recovers 20-40% of crawl budget. That budget gets redirected to actual product pages, new arrivals, and seasonal landing pages that weren't getting crawled before. Typical result: 15-30% more organic traffic within 60 days, just from more pages getting indexed.

The $20 Dollar Agency covers e-commerce client audits where facet-trap is the most common crawl-budget win. The detector is the first-pass diagnostic.

Last updated: April 2026