
Why ChatGPT Cites Your Competitor And Not You: Four Writing Frameworks With The Research To Back Them

There is a frustrating pattern I keep seeing in AI search results. A page on my site ranks first in Google for a query. The same query, asked of ChatGPT or Claude or Perplexity, cites a different page. Sometimes a page that is technically less complete. Sometimes a page from a smaller site. The model picked those pages over mine, and I wanted to know why.

The honest answer turns out not to be a secret. It is the writing.

Two streams of public research now make the selection criteria fairly visible. Kevin Indig analyzed three million ChatGPT responses and around 30 million citations, isolating around 18,000 verified citations to look at what they had in common. (The science of how AI pays attention, Growth Memo) Dan Petrovic, in earlier work, mapped the parallels between how humans skim and how transformers attend: both are looking for a fast way to extract meaning without reading every word. (DEJAN on human-friendly content being AI-friendly content, Petrovic's broader work on retrieval and chunking)

Both pieces of research arrive at the same set of writing patterns. The four below are the most replicable, the most testable, and the most resistant to becoming "AI tricks" that get patched out next quarter, because they are also good writing for human readers.

1. Bottom line up front, in every section

The principle is simple: state your conclusion in the first one or two sentences of any section, then justify it.

The reason it works for AI is mechanical. Transformer attention weights early tokens more heavily than late ones, particularly in the kind of retrieval-augmented generation pipelines that ChatGPT, Perplexity, and Claude run at query time. The model is sampling a passage, and the passage's first sentence carries a disproportionate share of what the model concludes the passage is about.

Indig's data: 44.2 percent of ChatGPT citations come from the first 30 percent of the page. (Search Engine Land on the 44 percent finding) That is not "the first paragraph slightly more often." That is "almost half the citations come from the top third."

Bottom line up front (BLUF) was originally a U.S. military communication standard, then borrowed by management consulting for the same reason: a busy reader needs the answer before the supporting argument. Modern AI is the busiest reader yet. It samples maybe 30 percent of your page. If your strongest sentence is in paragraph seven, the model probably never sees it.

In practice this means three small changes:

  • Your introduction states the answer, the recommendation, or the headline finding in the first paragraph, not after a setup paragraph.
  • Each H2 section's first sentence delivers that section's main point. Not a transition. Not a question. The point.
  • Section headings themselves carry meaning. "Why this matters" is a label. "Why a 6-second timeout matters more than a 3-second one" is a heading that already starts to answer the question.

Test for it: read only the first sentence of every section in your post. If a reader can follow your argument from those sentences alone, you are doing it right. If the first sentences are throat-clearing, rewrite them.
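The first-sentence test above can be automated. A minimal sketch in Python: it takes a post as a mapping from section heading to body text and extracts each section's first sentence with a rough punctuation heuristic (the `post` data and the sentence-split rule are illustrative assumptions, not anyone's published tooling).

```python
import re

def first_sentences(sections):
    """Return the first sentence of each section body.

    `sections` maps a heading to its body text. Splitting on the
    first ., ?, or ! is a rough heuristic, good enough for a self-test.
    """
    out = {}
    for heading, body in sections.items():
        match = re.match(r"\s*(.+?[.?!])(\s|$)", body, re.S)
        out[heading] = match.group(1).strip() if match else body.strip()
    return out

# Hypothetical post: read the output aloud and ask whether the
# argument is followable from these sentences alone.
post = {
    "Why a 6-second timeout matters": "A 6-second timeout halves error rates. The rest of this section shows the data.",
    "Implementation": "Set the timeout in one place, the load balancer config. Everything downstream inherits it.",
}
for heading, sentence in first_sentences(post).items():
    print(f"{heading}: {sentence}")
```

If the printed lines read as setup or transitions rather than claims, those are the sentences to rewrite.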

2. Definite language, not hedged language

Indig's other clear finding: cited content uses definite, declarative phrasing about twice as often as uncited content (around 36 percent versus around 20 percent). Phrases like "is defined as," "refers to," "means," and direct subject-verb-object constructions outperform hedged phrasings like "may suggest," "could potentially," and "it seems likely that." (Indig research overview at Growth Memo)

The mechanism is again straightforward. A retrieval system is looking for short passages that answer the user's question. A confident assertion is a candidate answer. A hedged sentence is a candidate guess. When the model is choosing between two passages on the same topic, the more definite one is the more useful one to cite.

This frustrates writers from academic backgrounds, and I include myself. Academic prose is full of "may," "might," "appears to suggest," and "could potentially." That is not laziness. It is intellectual humility, baked into the conventions of journals where overclaiming gets your paper rejected.

Web writing for AI citation is a different context. The reader is not a peer reviewer. The reader is a model deciding whether your sentence answers a question. Definite phrasing wins.

A worked example. Take a sentence I see often:

"It seems possible that updating your title tags may help improve click-through rates in some search results."

Same idea, written for citation:

"Updated title tags improve click-through rates in search results."

Same factual content. Different citation odds. The hedged version contains nothing the model can confidently extract. The direct version is a complete, citable answer.

The advice is not "be wrong, but louder." It is "if you actually know something, say so." Reserve hedges for genuinely uncertain claims. Use definite phrasing for definitions, established findings, accepted best practices, and clear recommendations.

3. Entity density

Indig's third finding, and the one with the largest absolute effect size: cited passages contain entities (proper nouns, brand names, specific tools, specific places, specific concepts) at roughly three to four times the rate of standard English prose. The cited band averaged around 20.6 percent entity density. Standard English prose averages around 5 to 8 percent. Ordinary writing is mostly verbs, adjectives, and connecting words; cited writing is mostly nouns and named things. (Stradiji summary of Indig's research)

The mechanism is entity-graph alignment. Modern LLMs encode relationships between named things (Google, San Francisco, Wikipedia, Cloudflare, Section 1256) as connected nodes in a learned graph. A passage rich in entities is easier for the model to slot into that graph. A passage that says "the company in the area" instead of "Cloudflare in San Francisco" is harder to attach to anything specific the model already knows.

For human readers, the same density does the same job: specifics make writing readable, while abstractions make it forgettable. "Optimize your meta titles, internal links, and Core Web Vitals" lands. "Optimize various technical SEO factors" does not.

The applied rule:

  • Replace generic nouns with specific named things. "A tool" becomes "Ahrefs Brand Radar." "An exchange" becomes "Coinbase Advanced." "An open source library" becomes "vectorbt." "A research paper" becomes "Han, Kang, and Ryu (2023), SSRN 4675565."
  • Cite by name and link, not "research shows."
  • Use exact numbers where you have them. "Around 1 percent" beats "small."
  • Where a class of things is meant, name two or three examples. "Anti-bot tools like Cloudflare Bot Management, Akamai Bot Manager, and DataDome" beats "anti-bot tools."

You will not hit 20 percent entity density on every page. Few writers do. But moving from 5 percent to 12 percent is a single editing pass, and the effect on both citation odds and reader comprehension is dramatic.
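A rough way to measure that editing pass, sketched in Python. Real entity density needs named-entity recognition; this stand-in counts capitalized tokens away from sentence starts, plus tokens containing digits, which is a crude heuristic of my own and not Indig's metric.

```python
import re

def entity_density(text):
    """Rough entity-density estimate.

    Counts capitalized tokens not at a sentence start, plus tokens
    containing digits, as "entities". A crude stand-in for real
    named-entity recognition, not the measure used in the research.
    """
    tokens = list(re.finditer(r"\w+", text))
    if not tokens:
        return 0.0
    # Positions where a sentence begins (start of text or after .!?).
    starts = {m.start(1) for m in re.finditer(r"(?:^|[.!?]\s+)(\w+)", text)}
    entities = 0
    for m in tokens:
        tok = m.group(0)
        if m.start() in starts and tok[0].isupper() and not any(c.isdigit() for c in tok):
            continue  # sentence-case capital, probably not an entity
        if tok[0].isupper() or any(c.isdigit() for c in tok):
            entities += 1
    return entities / len(tokens)

specific = "Optimize your meta titles with Ahrefs and Cloudflare in 2026."
generic = "Optimize various technical factors for better results overall."
print(entity_density(specific))  # 0.3
print(entity_density(generic))   # 0.0
```

The gap between the two sample sentences is the gap the editing pass closes: same advice, one version attachable to things the model knows, one not.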

4. Strategic repetition of key claims

The fourth pattern is the one that surprises writers, because it violates the high-school rule against repeating yourself. Strategic repetition increases citation odds because of how retrieval works: you place the same idea in two or three places in a piece, rephrased for each context.

The model does not read your whole page. It pulls passages. Each query fan-out generates a slightly different sub-question, which retrieves a slightly different passage. If your most important claim only exists in paragraph two, a sub-question that retrieves paragraph fourteen will not see it.

Repetition is insurance against retrieval-window mismatch. Three placements is the working pattern: the introduction, somewhere in the middle as a contextual reminder, and a closing reinforcement. Each version uses different phrasing, so the page does not read as redundant to a human, but the same idea gets multiple chances to land in whichever passage the model retrieves.

A worked example, on a single idea:

  • Introduction: "Internal linking is the most underrated lever in SEO."
  • Mid-article context: "This is exactly why internal linking matters more than backlinks for established sites: the authority is already there, and the question is where you point it."
  • Conclusion: "Backlinks get the glory, but for sites past the early stage, internal linking is the larger lever."

Three sentences, same insight, three retrieval surfaces. Repetition is a different discipline from BLUF: BLUF is about position; repetition is about redundancy. Use them together.

A self-test you can run on any page in five minutes

Open one of your existing posts. Run these four checks:

Check 1, BLUF compliance. Read the first sentence of every H2 section. Do those sentences alone communicate the post's argument? If they read as setup or transition, rewrite each one to deliver the section's actual point.

Check 2, hedge density. Search the page for "may," "might," "could," "potentially," "perhaps," "it is possible," "appears to," and "seems." Count them. For every match, ask whether the underlying claim is genuinely uncertain. If it is not, rewrite as a direct assertion.
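Check 2 is the easiest to script. A minimal sketch: whole-word, case-insensitive matching against the hedge list above (the sample sentence is made up for the demo).

```python
import re

# The hedge list from Check 2; extend it with your own verbal tics.
HEDGES = ["may", "might", "could", "potentially", "perhaps",
          "it is possible", "appears to", "seems"]

def hedge_matches(text):
    """Return every hedge phrase found in text, whole-word, case-insensitive."""
    found = []
    for phrase in HEDGES:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        found += [phrase] * len(re.findall(pattern, text, re.IGNORECASE))
    return found

sample = "Updating title tags may help. It seems likely this could work."
print(hedge_matches(sample))  # ['may', 'could', 'seems']
```

The count is only the starting point; the editorial question for each match is still whether the underlying claim is genuinely uncertain.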

Check 3, entity density. Pick three random paragraphs. Count nouns. Count specific named entities (proper nouns, named tools, named research, named numbers). The ratio should be at least one named entity per two or three sentences. If a paragraph has zero named entities, it is probably the kind of generic writing that does not get cited.

Check 4, repetition map. What is the single sentence you most want a reader to walk away with? Is that sentence on the page in three different places, with three different phrasings? If it is on the page once, you have a retrieval-window risk.
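Check 4 can be approximated in code too. This sketch splits the page into thirds and looks for the key terms of your core claim in each; substring matching is a crude proxy I am assuming for the demo, since a real check would need embeddings so that genuine rephrasings count as matches.

```python
def repetition_map(page_text, key_terms):
    """Check whether each key term of the core claim appears in every
    third of the page (intro / middle / conclusion).

    Substring matching is a crude proxy; rephrasings that drop the
    exact term will read as misses even when the idea is present.
    """
    n = len(page_text)
    thirds = {
        "intro": page_text[: n // 3],
        "middle": page_text[n // 3 : 2 * n // 3],
        "conclusion": page_text[2 * n // 3 :],
    }
    return {
        name: all(term.lower() in chunk.lower() for term in key_terms)
        for name, chunk in thirds.items()
    }

# Hypothetical page: the claim lands early and mid-page but never returns.
page = ("Internal linking is the lever. " * 5
        + "Backlinks get glory but nothing more here. " * 5)
print(repetition_map(page, ["internal linking"]))
```

A `False` in any third is the retrieval-window risk the check is after: a sub-question that retrieves from that third never sees the claim.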

Five minutes per post. Do it on the ten most important pages on your site. The lift in AI citation odds is real and shows up within a few weeks of publishing.

What this is not

A few honest disclaimers, because the AI-citation discourse has more snake oil than research.

These four frameworks do not get a non-ranking page cited. Indig's research, and Ahrefs' own ChatGPT-citation research, both show the same precondition: 76 percent of AI Overview citations come from the top 10 organic results, and 88 percent of ChatGPT-cited URLs come from ChatGPT's general search index. (Ahrefs on AI citation overlap with ranking) If your page is on result page five of Google, no amount of writing optimization makes it into ChatGPT. Rank first; then optimize for citation.

These frameworks are not "tricks" that game the system temporarily. They are how good non-fiction has always read. Military briefings have used BLUF for half a century. Encyclopedia entries have used definite phrasing forever. Good journalism uses entities at high density. Good textbooks repeat their key points. The reason the same patterns work for AI is that AI was trained on the same body of writing.

Finally, these frameworks do not replace clarity, accuracy, or original thinking. A page that is well-structured and well-cited but says nothing useful does not get cited at scale. The frameworks make a strong page legible to AI; they do not turn a weak page strong.

Fact-check notes and sources

This post is informational, not a citation guarantee. AI search systems update their retrieval and ranking pipelines continuously. Patterns described above reflect public research current to mid-2026 and may evolve. Test on your own site, track citation behavior over time, and treat any single-month result as one data point rather than a trend.
