How Claude retrieves and cites its sources
Claude cites its sources through two distinct channels, and the distinction changes everything. The first is built-in web search, which queries a real-time index and displays clickable links. The second is answers generated from its training corpus, with no link, where your brand exists only through its accumulated reputation.
When a user enables web search, Claude rephrases the query, retrieves a selection of pages, reads their text content, then synthesizes an answer while citing the URLs it kept. Anthropic's crawler, ClaudeBot, must therefore be able to access your pages. If it is blocked in robots.txt, you disappear from this channel.
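To make the robots.txt condition concrete, here is a minimal sketch that tests whether a given robots.txt lets the ClaudeBot user-agent reach a URL. It uses Python's standard `urllib.robotparser`. The two robots.txt samples and the `example.com` URL are illustrative assumptions, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser


def claudebot_allowed(robots_txt: str, url: str, user_agent: str = "ClaudeBot") -> bool:
    """Return True if the robots.txt content lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that shuts ClaudeBot out of the whole site.
BLOCKING = """\
User-agent: ClaudeBot
Disallow: /
"""

# Hypothetical robots.txt that keeps /admin/ private for generic crawlers
# and explicitly opens the site to ClaudeBot.
OPEN = """\
User-agent: *
Disallow: /admin/

User-agent: ClaudeBot
Allow: /
"""

if __name__ == "__main__":
    page = "https://example.com/guide"  # placeholder URL
    print("blocking file:", claudebot_allowed(BLOCKING, page))
    print("open file:", claudebot_allowed(OPEN, page))
```

To audit a live site, paste the content of your own `/robots.txt` into the function and test your strategic URLs one by one.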
Without web search, Claude answers from what it learned during training. No source is displayed, but your brand can be named if it appears often enough in trusted content. This is where the underlying battle is fought.
In both cases, one factor dominates: citability. Claude favors self-contained passages that answer a precise question without depending on the previous paragraph. A factual, complete block of 134 to 167 words has a far better chance of being extracted than prose that unspools across three screens.
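The length criterion above can be checked mechanically. The sketch below counts whitespace-separated words and compares the total with the 134 to 167 range quoted in this article. The function names are hypothetical, and a word count says nothing about whether the passage is actually self-contained.

```python
def word_count(passage: str) -> int:
    """Count whitespace-separated words in a passage."""
    return len(passage.split())


def is_citable_length(passage: str, low: int = 134, high: int = 167) -> bool:
    """True if the passage falls within the word range quoted in the article."""
    return low <= word_count(passage) <= high


if __name__ == "__main__":
    sample = "word " * 150  # stand-in for a real paragraph
    print(word_count(sample), is_citable_length(sample))
```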
What sets Claude apart from ChatGPT and Gemini
Claude is not a clone of ChatGPT with another logo. Three structural differences change your approach.
First, the expected tone. Claude is built for nuance and rigor. It values content that lays out the conditions, the limits, the edge cases. An article that asserts without proving is less likely to be picked up than content that shows its reasoning.
Next, the source of the index. Claude's web search relies on an index distinct from Google's. What surfaces in Google's AI Overviews does not necessarily surface in Claude, and vice versa. The same gap is documented between other engines: only 11% of the domains cited by ChatGPT are also cited by AI Overviews. Overlap between AI engines remains low, which forces you to optimize for each one.
| Criterion | ChatGPT / web search | Claude / web search |
|---|---|---|
| Audience volume | 900M+ users/week | Smaller audience, professional and technical use |
| Favored content tone | Direct, lists, quick answers | Nuance, rigor, explicit conditions |
| Strong off-site sources | Wikipedia, Reddit, YouTube | Wikipedia, press, technical documentation |
| Required access | OpenAI crawler allowed | ClaudeBot allowed in robots.txt |
| Overlap with Google | 11% of domains shared | Distinct index, low overlap |
Finally, the usage profile. Claude is heavily used in professional and technical contexts. Your B2B content, your methodological analyses and your precise documentation carry greater relative weight there. The logic stays the same as for getting cited by ChatGPT, but the high-signal content differs.
Making your pages readable by Claude
No optimization matters if Claude cannot read your page. The first rule is non-negotiable: LLMs do not execute JavaScript. Content loaded client-side, after rendering, is invisible to the crawler. Server-side rendering or static HTML is the entry condition.
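One way to test this entry condition is to look at the raw HTML only, as a crawler that does not execute JavaScript would. The sketch below extracts the text present in the markup, ignores the content of `script` and `style` tags, and checks whether a key phrase is there. The two sample pages are invented for illustration, and the exact parsing done by Claude's crawler is not documented here, so treat this as an approximation.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text found in raw HTML, skipping script and style content."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def raw_html_text(html: str) -> str:
    """Return the text available without executing any JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def visible_without_js(html: str, phrase: str) -> bool:
    """True if `phrase` is present in the raw markup itself."""
    return phrase.lower() in raw_html_text(html).lower()


# Hypothetical server-rendered page: the answer is in the markup.
SSR_PAGE = "<html><body><h2>What is GEO?</h2><p>GEO is the practice of earning AI citations.</p></body></html>"
# Hypothetical client-rendered page: the answer only exists inside a script.
CSR_PAGE = '<html><body><div id="root"></div><script>render("GEO is the practice of earning AI citations.")</script></body></html>'

if __name__ == "__main__":
    print(visible_without_js(SSR_PAGE, "GEO is the practice"))
    print(visible_without_js(CSR_PAGE, "GEO is the practice"))
```

In practice you would download the page source (for example with `urllib.request.urlopen`) and pass it to `visible_without_js` along with a sentence you expect Claude to cite.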
Once the page is readable, structure makes the difference. Claude extracts passages, not whole pages. The more clearly your blocks are segmented, each under an explicit heading and able to stand on its own, the more citable they are.
Check that your content appears in the raw source code, without JavaScript execution. Disable JS in your browser and reload: what remains is what Claude reads.
The first sentence under each H2 must answer the question implicit in the heading. Claude reuses these openings as ready-to-cite answers.
Aim for 134 to 167 words per block. This is the optimal length for a citable block: long enough to be complete, short enough to be extracted as is. Break up paragraphs that run too long.
Structured markup, such as FAQPage schema, is a strong signal for generated answers and AI Overviews. It helps Claude identify your question-answer pairs as citable units.
Confirm that your robots.txt does not block Anthropic's crawler. Without access, no citation is possible via web search.
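For the structured markup item in the checklist above, a minimal sketch of FAQPage JSON-LD generation might look like the following. The property names (`FAQPage`, `mainEntity`, `Question`, `acceptedAnswer`, `Answer`) come from the schema.org vocabulary. The helper name `faq_jsonld` and the sample question are assumptions made for illustration.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build a schema.org FAQPage JSON-LD document from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Illustrative pair reusing a question from this article's FAQ.
    print(faq_jsonld([
        ("Does Claude read content loaded via JavaScript?",
         "No. Server-side rendering or static HTML is essential to be read, then cited."),
    ]))
```

The resulting JSON goes inside a `<script type="application/ld+json">` tag in the server-rendered HTML, so that it is present in the raw source.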
These technical fundamentals overlap with those of any AI visibility strategy. A GEO agency systematically starts with this accessibility audit before touching the content, because optimizing a page the engine cannot see makes no sense.
Becoming a citable source off-site
A citation from Claude is earned as much off your site as on it. This is the most counterintuitive point, and the most decisive for the corpus channel, where no link is displayed.
The Ahrefs analysis of 200,000 domains (December 2025) is unequivocal: off-site brand mentions correlate far more strongly with AI citations than classic domain authority does. The correlation between YouTube mentions and AI citations reaches 0.737, versus only 0.266 for Domain Rating. In other words, being named everywhere weighs more than the number of backlinks.
Encyclopedic and community sources dominate the references of LLMs. Being present, and correctly described, on Wikipedia, Reddit and trusted platforms anchors your brand in the corpus, far beyond your own site.
The logic holds for Claude, which trains on corpora where these same sources weigh heavily. Your off-site program therefore targets three things: encyclopedic and documentary references, expert discussions on forums and communities, and press mentions on recognized media in your sector.
One detail matters: the consistency of your naming. If your brand, your expertise and your positioning are described the same way everywhere, Claude associates these signals with one another. Fuzzy naming dilutes the signal. The clearer and the more identically repeated your identity is, the more it imprints.
Measuring and steering your visibility in Claude
You can only steer what you measure, and measuring visibility in Claude remains a craft in 2026. No official tool surfaces your citations exhaustively. Structured manual tracking is the benchmark method.
Build a list of real queries from your audience, the questions they would actually ask Claude. Ask them with web search enabled, then without, to test both channels. Note whether your brand is cited, with or without a link, and at what position in the answer. Repeat the exercise at regular intervals to detect trends.
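The manual tracking described above fits in a very small data structure. The sketch below records one observation per query and per channel, then computes the share of queries where the brand was cited. The field names and sample queries are hypothetical, and the values are entered by hand after each test, since no API is assumed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    query: str
    web_search: bool                # True: web search channel, False: corpus channel
    cited: bool                     # was the brand cited or named in the answer?
    linked: bool = False            # was a clickable link displayed?
    position: Optional[int] = None  # rank in the answer, 1 = first mention


def citation_rate(observations: list[Observation], web_search: bool) -> float:
    """Share of tested queries on one channel where the brand was cited."""
    channel = [o for o in observations if o.web_search == web_search]
    if not channel:
        return 0.0
    return sum(o.cited for o in channel) / len(channel)


if __name__ == "__main__":
    # Invented sample log for one tracking session.
    log = [
        Observation("best GEO agency", web_search=True, cited=True, linked=True, position=2),
        Observation("best GEO agency", web_search=False, cited=False),
        Observation("how to get cited by Claude", web_search=True, cited=False),
        Observation("how to get cited by Claude", web_search=False, cited=False),
    ]
    print("web search:", citation_rate(log, web_search=True))
    print("corpus:", citation_rate(log, web_search=False))
```

Keeping one log per session lets you compare the two rates over time and see which channel is moving.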
This tracking tells you where to act. Cited via web search but not in the corpus? Strengthen off-site. Never cited, even with web search? Revisit technical accessibility and passage citability. Cited by ChatGPT but not by Claude? Shift the tone toward more rigor and nuance.
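The three diagnostic rules above can be written down as a small decision function. This is only a sketch of the reasoning in this section: the function name and the wording of the returned actions are assumptions, and real cases will often need more nuance than three booleans.

```python
def next_actions(
    cited_with_search: bool,
    cited_without_search: bool,
    cited_by_chatgpt: bool = False,
) -> list[str]:
    """Map tracking results to the priority actions described in the article."""
    actions: list[str] = []
    cited_by_claude = cited_with_search or cited_without_search

    # Present via web search but absent from the corpus channel.
    if cited_with_search and not cited_without_search:
        actions.append("strengthen off-site mentions")

    # Never cited by Claude on either channel.
    if not cited_by_claude:
        actions.append("revisit technical accessibility and passage citability")

    # Visible in ChatGPT but not in Claude.
    if cited_by_chatgpt and not cited_by_claude:
        actions.append("shift the tone toward more rigor and nuance")

    return actions


if __name__ == "__main__":
    print(next_actions(cited_with_search=True, cited_without_search=False))
    print(next_actions(cited_with_search=False, cited_without_search=False, cited_by_chatgpt=True))
```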
Keep the scale of the game in mind. More than 50% of Google queries already trigger an AI Overview, and 92% of the citations that appear there come from the top 10, including 47% from positions 5 to 10. Classic SEO remains the foundation, but it is no longer enough on its own. Visibility is now built across several engines in parallel.
To structure this approach, our 40-point Getting Cited by ChatGPT Checklist covers the fundamentals transferable to Claude: accessibility, citability, schema and off-site signals.
If Claude does not cite your site, our free GEO audit pinpoints exactly why and identifies the priority actions to become a source Claude chooses.
Frequently asked questions
Does Claude read content loaded via JavaScript?
No. Like other LLM-based engines, Claude's web search does not render JavaScript. If your content only appears after client-side execution, Claude cannot see it. Server-side rendering (SSR) or static HTML is essential to be read, then cited.
How do I know if Claude cites my site?
Ask Claude the real questions your audience would ask with web search enabled, and note which URLs it displays as sources. Repeat the exercise across your strategic queries. No official analytics tool reliably surfaces these citations yet, so manual tracking remains the benchmark.
Does FAQPage schema help with getting cited by Claude?
Yes. Structured markup, especially FAQPage, clarifies the meaning of your content and acts as a strong signal for generated answers, including AI Overviews. It makes it easier to extract short, self-contained passages that Claude can reuse directly.
Should I block Anthropic's crawler in robots.txt?
No, not if you are aiming for AI visibility. The ClaudeBot user-agent must be able to access your pages so Claude can retrieve and cite them. Blocking this crawler means deliberately excluding yourself from answers Claude generates with web search.



