The math: a category page with N facet filters, each offering K values, can produce (K+1)^N unique URLs. Add sort, page, and utm parameters and the number explodes into five digits fast. For most e-commerce sites, one popular category holds more combinatorial URLs than Googlebot can crawl in a month.
Facet Trap Detector scans a sample category URL, extracts every facet link, and does the combo math.
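The combo math can be sketched in a few lines of Python. This is a minimal illustration, not the detector's actual implementation: the function names `facet_values` and `combo_count`, the `shop.example` URLs, and the assumption that every facet appears as a plain query parameter are all hypothetical. It generalizes (K+1)^N to facets with different value counts by multiplying (k_i + 1) per facet.

```python
from math import prod
from urllib.parse import urlsplit, parse_qsl


def facet_values(facet_links):
    """Group the distinct values seen for each query parameter."""
    values = {}
    for link in facet_links:
        for name, value in parse_qsl(urlsplit(link).query):
            values.setdefault(name, set()).add(value)
    return values


def combo_count(values):
    """Each facet is either unset or set to one of its values."""
    return prod(len(v) + 1 for v in values.values())


# Hypothetical facet links extracted from one category page.
links = [
    "https://shop.example/shirts?color=red",
    "https://shop.example/shirts?color=blue",
    "https://shop.example/shirts?size=S",
    "https://shop.example/shirts?size=M",
    "https://shop.example/shirts?size=L",
    "https://shop.example/shirts?brand=acme",
]

facets = facet_values(links)
total = combo_count(facets)
print(total)  # (2+1) * (3+1) * (1+1) = 24
```

Six facet links already yield 24 URL variants, counting the unfiltered page. Real categories with more facets and values grow much faster.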
Why single-param crawl-waste auditing misses this
Param Crawl-Waste already counts parameter frequency across a sitemap. It catches ?sort=price appearing on 500 URLs. What it doesn't do is compute what happens when ?sort=price combines with ?color=red and ?size=M and ?page=2 and ?utm_source=email. The math is different: single-param is additive, combinatorial is multiplicative.
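To make the additive-versus-multiplicative difference concrete, here is a small worked example. The value counts per parameter are invented for illustration and do not come from any audited site.

```python
from math import prod

# Hypothetical number of distinct values per parameter on one category page.
param_values = {"sort": 4, "color": 8, "size": 5, "page": 10, "utm_source": 3}

# Single-param view: each parameter varies on its own, so the counts add.
additive = sum(param_values.values())

# Combinatorial view: each parameter is absent or takes one value,
# so the per-parameter choices multiply.
multiplicative = prod(k + 1 for k in param_values.values())

print(additive)        # 4 + 8 + 5 + 10 + 3 = 30
print(multiplicative)  # 5 * 9 * 6 * 11 * 4 = 11880
```

The same five parameters look like 30 URLs when counted one at a time and 11,880 when combined.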
For one small e-commerce site I tested, Param Crawl-Waste reported 18% crawl waste. Facet Trap Detector reported that one category page alone could generate 4,200 URLs for 65 products, roughly a 65:1 ratio. Both are right; they measure different axes.
The classification algorithm
Every facet parameter needs one of three dispositions:
- Block. utm_*, fbclid, gclid, ref, session tokens, sort, page, view-mode. These should never be indexed. In robots.txt: Disallow: /*?utm_* etc.
- Canonicalize. color, size, brand, material facets. The variants should exist (users filter them), but the canonical tag on the variant page should point to the unfiltered category. Google will still crawl, but won't index.
- Index. A specific facet combination that targets a real search query. "Red T-Shirts" is a query people search for, so ?color=red should be its own indexable page with a unique title + H1.
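The three dispositions can be expressed as a simple lookup. This is a sketch under stated assumptions, not the detector's real classifier: the parameter names `sessionid` and `view` stand in for "session tokens" and "view-mode", the `INDEX` allowlist is hypothetical, and the `"review"` fallback for unknown parameters is an addition for illustration.

```python
from fnmatch import fnmatchcase

# Patterns for parameters that should be blocked outright.
# "sessionid" and "view" are assumed names for session tokens and view-mode.
BLOCK = ["utm_*", "fbclid", "gclid", "ref", "sessionid", "sort", "page", "view"]

# Facets that users need but that should canonicalize to the unfiltered category.
CANONICALIZE = ["color", "size", "brand", "material"]

# Hypothetical allowlist of facet values that match a real search query.
INDEX = {("color", "red")}


def classify(param, value=None):
    """Return the disposition for one query parameter (and optional value)."""
    if (param, value) in INDEX:
        return "index"
    if any(fnmatchcase(param, pattern) for pattern in BLOCK):
        return "block"
    if param in CANONICALIZE:
        return "canonicalize"
    # Anything unrecognized is flagged for a human decision (an assumption here).
    return "review"


print(classify("utm_source", "email"))  # block
print(classify("color", "red"))         # index
print(classify("color", "blue"))        # canonicalize
```

Checking the index allowlist first lets a single facet value such as color=red be promoted while every other color still canonicalizes.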
Most e-commerce sites ship 100% of facets as indexable. A better target distribution is roughly 80% block, 15% canonicalize, 5% index.
What the fix buys you
On a mid-sized catalog site, fixing facet-trap recovers 20-40% of crawl budget. That budget gets redirected to actual product pages, new arrivals, and seasonal landing pages that weren't getting crawled before. Typical result: 15-30% more organic traffic within 60 days, just from more pages getting indexed.
Related reading
- Param Crawl-Waste — single-param frequency
- Index Coverage Delta — what's actually indexed
- Sitemap Audit — sitemap hygiene
Fact-check notes and sources
- Google faceted navigation guidelines: developers.google.com/search/docs/specialty/ecommerce/faceted-navigation
- Consolidate duplicate URLs: developers.google.com/search/docs/crawling-indexing/consolidate-duplicate-urls
- Search Central "crawl budget" guidance: developers.google.com/search/blog/2017/01/what-crawl-budget-means-for-googlebot
The $20 Dollar Agency covers e-commerce client audits where facet-trap is the most common crawl-budget win. The detector is the first-pass diagnostic.