LLM Tokenizer Efficiency

Not all content tokenizes equally. Pages heavy with code blocks, punctuation, or numerical tables can use 15-25% more tokens than clean prose to convey the same information. At scale, that difference is real money. This tool measures tokens-per-word efficiency across the five major tokenizer families.
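The metric the tool reports can be sketched as a simple ratio: token count divided by whitespace-delimited word count. The snippet below is a minimal illustration, not the tool's actual implementation; the `rough_count` heuristic is a stand-in, and a real measurement would plug in an actual tokenizer (e.g. tiktoken's `encoding.encode`).

```python
import re

def tokens_per_word(text: str, count_tokens) -> float:
    """Tokens-per-word ratio: lower means the tokenizer packs
    more of the text's information into each token."""
    words = re.findall(r"\S+", text)
    if not words:
        return 0.0
    return count_tokens(text) / len(words)

# Stand-in token counter for illustration only: roughly one token
# per four characters, a common rule of thumb for English prose.
def rough_count(text: str) -> int:
    return max(1, len(text) // 4)

prose = "Clean prose tends to tokenize efficiently."
print(f"{tokens_per_word(prose, rough_count):.2f}")
```

Swapping `rough_count` for a real tokenizer lets you compare the same text across tokenizer families: code-heavy or punctuation-dense input will show a visibly higher ratio than plain prose.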

Context and background

Read the story behind this tool: Tokenize efficiently, retrieve affordably →

Input

Paste text directly: