AI Hallucination Detector

Every major LLM makes factual errors about brands. ChatGPT might say your hours are 9–5 when they are actually 7–8. This tool captures your canonical facts, accepts a pasted response from any LLM, diffs the two, scores the severity of each discrepancy, and exports a correction plan. Paid monitoring services charge $200–500/month for this. This tool is free.

Context and background

Read the story behind this tool: When LLMs get your brand wrong: how to measure and correct →

Step 1 — Declare your canonical facts

One fact per row. Include a short key (what the fact is), the canonical value (ground truth), and optional synonyms the LLM might use as paraphrases. Rows are saved to localStorage automatically.
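
For readers curious how a fact row might be represented and persisted, here is a minimal sketch. The `CanonicalFact` shape, the `canonical-facts` storage key, and the `saveFacts`/`loadFacts` helpers are all assumptions made for illustration, not the tool's actual internals. The store is passed in as a parameter so the sketch also runs outside a browser.

```typescript
// Hypothetical shape for one row of the facts table.
interface CanonicalFact {
  key: string;        // short label, e.g. "hours"
  value: string;      // ground truth, e.g. "7–8"
  synonyms: string[]; // paraphrases that should also count as correct
}

// Minimal subset of the Web Storage API that this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed storage key; the real tool may use a different one.
const STORAGE_KEY = "canonical-facts";

// Serialize every row as JSON under a single key.
function saveFacts(store: KeyValueStore, facts: CanonicalFact[]): void {
  store.setItem(STORAGE_KEY, JSON.stringify(facts));
}

// Read the rows back; fall back to an empty list if nothing usable is stored.
function loadFacts(store: KeyValueStore): CanonicalFact[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as CanonicalFact[]) : [];
  } catch {
    return [];
  }
}
```

In a browser, `window.localStorage` satisfies `KeyValueStore`, so `saveFacts(window.localStorage, facts)` could be called after each edit to a row.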

Step 2 — Generate prompts for each LLM

Copy each prompt and run it against the LLM in a fresh chat (no prior brand context). Then paste the response below.
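
As a rough illustration of what a generated prompt could look like, the sketch below builds one from the brand name and the fact keys. The `buildPrompt` function and its wording are hypothetical. The one deliberate choice is that only the keys go into the prompt, never the canonical values, so the model is not handed the answers it is being tested on.

```typescript
// Hypothetical prompt builder: asks about each fact key without revealing
// the canonical value.
function buildPrompt(brandName: string, factKeys: string[]): string {
  const questions = factKeys
    .map((factKey, index) => `${index + 1}. ${factKey}`)
    .join("\n");
  return (
    `What do you know about ${brandName}? ` +
    `Answer each item briefly, and say "unknown" if you are not sure.\n` +
    questions
  );
}
```

For example, `buildPrompt("Acme Bakery", ["hours", "address"])` yields a two-item questionnaire about a made-up brand that can be pasted into a fresh chat.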

Step 3 — Paste LLM responses

One response panel per model. The diff runs automatically when you paste into any panel.
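
The page does not describe how the diff or the severity score is computed, so the following is only one plausible sketch under stated assumptions. It normalizes case, dashes, and whitespace, then treats a fact as a match if the canonical value or any synonym appears in the response. If the key is mentioned but no accepted value is found, it is flagged as a high-severity mismatch; if the key is not mentioned at all, it is a low-severity omission. The names `checkFact`, `diffResponse`, and the three-level severity scale are hypothetical.

```typescript
interface CanonicalFact {
  key: string;
  value: string;
  synonyms: string[];
}

type Verdict = "match" | "mismatch" | "missing";
type Severity = "none" | "low" | "high";

interface FactResult {
  key: string;
  verdict: Verdict;
  severity: Severity;
}

// Lowercase, map Unicode dashes (en dash, em dash, etc.) to "-", and
// collapse runs of whitespace so "7–8" and "7-8" compare equal.
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[\u2010-\u2015]/g, "-")
    .replace(/\s+/g, " ")
    .trim();
}

// Assumed scoring: wrong answer = high, no answer = low, correct = none.
function checkFact(fact: CanonicalFact, response: string): FactResult {
  const haystack = normalize(response);
  const accepted = [fact.value].concat(fact.synonyms).map(normalize);
  const found = accepted.some(
    (candidate) => candidate.length > 0 && haystack.indexOf(candidate) !== -1
  );
  if (found) {
    return { key: fact.key, verdict: "match", severity: "none" };
  }
  const mentioned = haystack.indexOf(normalize(fact.key)) !== -1;
  return mentioned
    ? { key: fact.key, verdict: "mismatch", severity: "high" }
    : { key: fact.key, verdict: "missing", severity: "low" };
}

// Run every declared fact against one pasted response.
function diffResponse(facts: CanonicalFact[], response: string): FactResult[] {
  return facts.map((fact) => checkFact(fact, response));
}
```

Plain substring matching is crude: a value such as "7-8" would also match inside "17-8", and a response that paraphrases the key itself would be reported as missing. A real implementation would likely need token-aware matching.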