AI Referral Traffic Without a Paid Analytics Subscription

A click-through from ChatGPT, Perplexity, Gemini, Copilot, Claude, You.com, Grok, or DuckDuckGo's AI mode usually arrives with a Referer header. That header lands in your Nginx, Apache, Cloudflare, or Netlify access log. The only thing between you and "which AI engines drive traffic to my site" is a parser that reads the referer field and buckets hits by origin. That's roughly a 50-line JavaScript program.

The AI Referrer Log Parser is that program. Paste an excerpt of your access log, get per-engine hit counts, top landing URLs, human-UA vs bot-UA split, and a CSV export.

Why this is free

Paid AI-traffic analytics tools — Profound, Scrunch, Otterly, Indexly — charge $99-$299/month. The methodology is not secret: they tail your CDN logs and filter on referer hostname. Sometimes they add GTM/GA4 hooks to catch client-side referer data.

The parser does the same classification: a regex matches the referer field, compares against the known AI engine hosts, increments a counter per match. No server round-trip, no subscription, no API key.
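The classification step can be sketched as follows. This is a minimal illustration, not the tool's actual source: the names `AI_HOSTS`, `classifyReferer`, and `countByEngine` are hypothetical, the host list contains only the hostnames given in the fact-check notes of this post, and the sketch uses the built-in `URL` parser instead of a regex to pull out the hostname.

```javascript
// Referer hostnames taken from the fact-check notes in this post.
// Engines without a listed hostname are left out rather than guessed.
const AI_HOSTS = new Map([
  ["chat.openai.com", "ChatGPT"],
  ["chatgpt.com", "ChatGPT"],
  ["perplexity.ai", "Perplexity"],
  ["gemini.google.com", "Gemini"],
  ["copilot.microsoft.com", "Copilot"],
  ["claude.ai", "Claude"],
]);

// Map a referer URL to an engine name, or null if it is not a known AI host.
function classifyReferer(referer) {
  let host;
  try {
    host = new URL(referer).hostname.toLowerCase();
  } catch {
    return null; // "-", empty, or malformed referer
  }
  // Treat "www.example.com" and "example.com" as the same origin.
  host = host.replace(/^www\./, "");
  return AI_HOSTS.get(host) ?? null;
}

// Increment one counter per engine for every matching referer.
function countByEngine(referers) {
  const counts = new Map();
  for (const referer of referers) {
    const engine = classifyReferer(referer);
    if (engine) counts.set(engine, (counts.get(engine) ?? 0) + 1);
  }
  return counts;
}
```

Matching on the exact hostname, rather than a substring, keeps look-alike domains from being counted as AI traffic.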

What the tool doesn't do (and, honestly, neither do most paid tools): attribute conversions. Log referers tell you "a visitor arrived from ChatGPT." They don't tell you whether that visitor bought anything. For conversion attribution, pair the parser with GA4 and the GA4 LLM Referral tool.

What the output shows

For each AI engine with at least one hit:

  • Hit count. How many log lines had a referer from that engine.
  • Human UA share. The fraction of hits whose user-agent looks like a browser (Mozilla, Chrome, Safari) rather than a bot. AI engines fetch pages two ways: a server-side retrieval by their bot (shows up with a bot UA), and a human-clicked link from the AI answer citation (shows up with a browser UA). The second is the one that matters for conversion.
  • Top landing URLs. Which five URLs each AI engine sent the most traffic to. This is where you learn which of your pages are being cited in AI answers.
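The three output columns above could be computed along these lines. This is a sketch under assumptions: the hit shape `{ engine, path, userAgent }`, the function names, and the bot-marker regex are all hypothetical, and real bot user agents vary enough that the pattern should be treated as a starting point only.

```javascript
// Assumed heuristics: browser tokens named in this post, plus a guessed
// list of bot markers. Many bots also send "Mozilla", so the bot check wins.
const BROWSER_PATTERN = /Mozilla|Chrome|Safari/;
const BOT_PATTERN = /bot|crawler|spider|fetch/i;

function looksHuman(userAgent) {
  return BROWSER_PATTERN.test(userAgent) && !BOT_PATTERN.test(userAgent);
}

// hits: array of { engine, path, userAgent } (hypothetical shape).
function summarize(hits, topN = 5) {
  const byEngine = new Map();
  for (const { engine, path, userAgent } of hits) {
    if (!byEngine.has(engine)) {
      byEngine.set(engine, { hits: 0, human: 0, urls: new Map() });
    }
    const stats = byEngine.get(engine);
    stats.hits += 1;
    if (looksHuman(userAgent)) stats.human += 1;
    stats.urls.set(path, (stats.urls.get(path) ?? 0) + 1);
  }

  const summary = [];
  for (const [engine, stats] of byEngine) {
    // Sort landing URLs by hit count, highest first, and keep the top N.
    const topUrls = [...stats.urls.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, topN)
      .map(([url, count]) => ({ url, count }));
    summary.push({
      engine,
      hits: stats.hits,
      humanShare: stats.human / stats.hits,
      topUrls,
    });
  }
  return summary;
}
```

Checking the bot pattern before accepting a browser token matters because a bot can include "Mozilla" in its user agent and would otherwise be counted as human.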

The CSV export lets you merge the counts with GA4 data or track them over time against a rolling baseline.
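For illustration, a CSV export of per-engine rows might look like the sketch below. The column names (`engine`, `hits`, `human_share`) and the row shape are assumptions, not the tool's documented output.

```javascript
// Quote a field only when it contains a comma, quote, or line break,
// doubling any embedded quotes as CSV requires.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// rows: array of { engine, hits, humanShare } (hypothetical shape).
function toCsv(rows) {
  const lines = ["engine,hits,human_share"];
  for (const row of rows) {
    const fields = [row.engine, row.hits, row.humanShare.toFixed(3)];
    lines.push(fields.map(csvField).join(","));
  }
  return lines.join("\n");
}
```

Escaping matters most if you later add a landing-URL column, since URLs with query strings can contain commas.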

What's actually in the log

The parser handles four log shapes: Nginx combined, Apache combined, Cloudflare/CDN JSON, and Netlify. In the standard combined format the referer is the second quoted string on each line, after the request line and before the user agent; custom formats may shift it by one position. In JSON logs it is a referer or http_referer key.
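A minimal sketch of the extraction step, assuming the standard combined log layout and the two JSON keys named above. The function name `parseLine` and the returned object shape are hypothetical, and Netlify-specific layouts are not covered here.

```javascript
// Combined log format:
// host ident user [time] "request" status bytes "referer" "user-agent"
const COMBINED = /^\S+ \S+ \S+ \[[^\]]+\] "([^"]*)" (\d{3}) \S+ "([^"]*)" "([^"]*)"/;

// Return the parsed fields for one log line, or null if the line is unparsed.
function parseLine(line) {
  const trimmed = line.trim();

  // JSON logs: look for the referer under either key mentioned in this post.
  if (trimmed.startsWith("{")) {
    try {
      const record = JSON.parse(trimmed);
      return { format: "json", referer: record.referer ?? record.http_referer ?? null };
    } catch {
      return null;
    }
  }

  const match = COMBINED.exec(trimmed);
  if (!match) return null;
  const [, request, status, referer, userAgent] = match;
  // The request string looks like "GET /path HTTP/1.1"; keep the path only.
  const path = request.split(" ")[1] ?? null;
  return { format: "combined", path, status: Number(status), referer, userAgent };
}
```

One known limitation of this sketch: a quoted field that contains an escaped double quote will not match, so such lines come back as `null` and should be counted as unparsed.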

Expect noise. Bots spoof referrers. Some AI engines fetch through CDN proxies that mask origin. Some log formats strip the referer entirely for privacy. The parser reports unparsed lines so you know what fraction it couldn't classify.

A useful baseline: run the parser against the last 30 days of logs, write down the per-engine numbers, then re-run monthly. The absolute numbers matter less than the trend line.
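To make the trend concrete, the month-over-month change per engine can be computed from two runs. The sketch below assumes each run is a plain object of engine name to hit count; the function name `monthOverMonth` is hypothetical and the numbers in the usage example are made up for illustration.

```javascript
// previous, current: objects mapping engine name -> hit count for one run.
function monthOverMonth(previous, current) {
  const engines = new Set([...Object.keys(previous), ...Object.keys(current)]);
  const rows = [];
  for (const engine of engines) {
    const before = previous[engine] ?? 0;
    const after = current[engine] ?? 0;
    // Percent change is undefined when there was no traffic last month.
    const changePct = before === 0 ? null : ((after - before) / before) * 100;
    rows.push({ engine, before, after, changePct });
  }
  return rows;
}

// Illustrative numbers only, not real measurements.
const trend = monthOverMonth(
  { ChatGPT: 40, Perplexity: 10 },
  { ChatGPT: 50, Claude: 5 }
);
console.log(trend);
```

An engine that appears for the first time gets a `null` change rather than an infinite percentage, which keeps the trend column safe to chart.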

Related reading

  • GA4 LLM Referral — the GA4 audience + Looker Studio filter + GTM trigger + BigQuery SQL for tracking the same signal inside Google Analytics.
  • AI Visibility Prompt Pack — companion manual-testing tool for measuring whether your brand shows up in AI answers at all.
  • Share-of-Voice Worksheet — counts mention-rate in pasted AI responses.
  • AI Posture Audit — upstream check: are AI bots even reaching your site? If this parser returns zero hits, robots.txt might be the cause.

Fact-check notes and sources

  • Indexly AI Traffic Analyzer marketing page: indexly.ai
  • Profound traffic-tracking marketing page: tryprofound.com
  • ChatGPT referer hostnames: chat.openai.com, chatgpt.com
  • Perplexity referer: perplexity.ai
  • Google Gemini referer: gemini.google.com
  • Microsoft Copilot referer: copilot.microsoft.com
  • Anthropic Claude referer: claude.ai

The $100 Network covers site-network analytics where one log-parsing pass across 20 domains is faster than 20 subscriptions. The parser is that pass, shrunk to a browser.

Last updated: April 2026