
Vector Embedding Similarity

Copyscape catches byte-level plagiarism but misses paraphrased near-duplicates. This tool computes cosine similarity on both word-frequency and shingled-phrase fingerprints to catch the semantic overlap Google's systems see. Optionally, you can use a real embedding API (OpenAI, Voyage, Cohere) with your own key; the key never leaves the browser.
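As a rough illustration of the two fingerprints, here is a minimal Python sketch. It is not this tool's actual code: the function names, the lowercase regex tokenizer, and the 3-word shingle size are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumed tokenizer: lowercase, keep runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(tokens: list[str], k: int = 3) -> list[str]:
    # Overlapping k-word phrases; k=3 is an illustrative choice.
    return [" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # empty input: treat as no overlap
    return dot / (norm_a * norm_b)


def similarity(text_a: str, text_b: str, k: int = 3) -> dict[str, float]:
    # Score the pair on both fingerprints: single words and k-word shingles.
    ta, tb = tokenize(text_a), tokenize(text_b)
    return {
        "word": cosine(Counter(ta), Counter(tb)),
        "shingle": cosine(Counter(shingles(ta, k)), Counter(shingles(tb, k))),
    }


if __name__ == "__main__":
    print(similarity("The quick brown fox jumps", "The quick brown fox leaps"))
```

Swapping one word out of five leaves the word-frequency score high while the shingle score drops faster, because every phrase containing the changed word stops matching. Reading the two scores together helps separate reordered vocabulary from copied phrasing.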

Context and background

Read the story behind this tool: Byte-level plagiarism detection misses the overlap that hurts rankings →

Inputs

Or paste raw text instead: