RAG (Retrieval-Augmented Generation)

RAG (Retrieval-Augmented Generation) is a technique that lets a language model retrieve relevant documents at the moment of answering, then generate its response from those sources. This mechanism decides which content an AI cites.

RAG combines two steps: first retrieving relevant documents from a database or the web, then generating an answer grounded in those documents. This is what lets an AI product such as Perplexity cite up-to-date sources rather than answer purely from what the model memorized during training.

How RAG works

When a user asks a question, the system turns that question into a search query, retrieves the most relevant passages, and passes them to the LLM as context. The model then writes its answer from those excerpts and cites them. The quality and structure of your content determine whether it makes it through the retrieval step.
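The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular AI product works: the corpus, the example.com URLs, and the function names are hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding search a production system would more likely use. The sketch stops at building the prompt, so no LLM is actually called.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, passages, k=2):
    """Return up to k passages that share vocabulary with the question, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p["text"]))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]

def build_prompt(question, retrieved):
    """Assemble the context an LLM would receive, with numbered sources to cite."""
    lines = ["Answer using only the sources below and cite them by number.", ""]
    for i, p in enumerate(retrieved, start=1):
        lines.append(f"[{i}] ({p['url']}) {p['text']}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)

# Hypothetical corpus: in a real system these would come from an index or the web.
PASSAGES = [
    {"url": "https://example.com/rag",
     "text": "RAG is a technique that retrieves documents and then generates an answer from them."},
    {"url": "https://example.com/geo",
     "text": "GEO is the practice of optimizing content so that AI engines cite it."},
    {"url": "https://example.com/cooking",
     "text": "A good risotto needs warm stock and constant stirring."},
]

question = "What is RAG?"
top = retrieve(question, PASSAGES)
prompt = build_prompt(question, top)
print(prompt)
```

In this sketch the cooking passage shares no words with the question, so it never reaches the prompt. That is the point of the section in miniature: a passage that is not retrieved cannot be cited.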

A concrete example

An AI receives the question "how do I optimize for AI Overviews?". Its RAG engine fetches the most relevant pages, extracts self-contained passages, and composes its answer from the ones that address the question most directly. A self-contained "X is…" block of about 150 words is a good length for being retained.
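As a rough sketch of the extraction step in this example, the snippet below splits a page into passages of at most 150 words and picks the one that covers the most words from the question. Everything here is an assumption made for illustration: the sample page text, the `split_passages` and `directness` helpers, and the word-overlap score are hypothetical, and a real engine would probably split on sentence or heading boundaries and rank with a learned model.

```python
import re

MAX_WORDS = 150  # the passage length suggested in the text, used here as a cap

def split_passages(page_text, max_words=MAX_WORDS):
    """Split a page on blank lines, then cut long paragraphs into chunks of at most max_words."""
    passages = []
    for para in re.split(r"\n\s*\n", page_text.strip()):
        words = para.split()
        for start in range(0, len(words), max_words):
            passages.append(" ".join(words[start:start + max_words]))
    return passages

def directness(question, passage):
    """Fraction of the question's distinct words that also appear in the passage."""
    q = set(re.findall(r"[a-z0-9]+", question.lower()))
    p = set(re.findall(r"[a-z0-9]+", passage.lower()))
    return len(q & p) / len(q) if q else 0.0

# Hypothetical page content, one paragraph per passage.
page = """AI Overviews are summaries shown above search results.

To optimize for AI Overviews, open each section with a direct definition and keep it self-contained.

Our team was founded in a small office."""

question = "how do I optimize for AI Overviews?"
passages = split_passages(page)
best = max(passages, key=lambda p: directness(question, p))
print(best)
```

The paragraph that answers the question directly wins because it repeats the question's own terms, while the paragraph about the team scores zero.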

Why it matters

RAG is the bridge between your content and the AI's answer. Optimizing for retrieval is the heart of GEO (Generative Engine Optimization): citable passages, schema markup, and topical authority. To structure this approach, see our GEO agency.

Key takeaway

In RAG, if you're not retrieved, you're not cited. Retrieval comes before everything.

Frequently asked questions

The retrieval phase of RAG decides which documents the AI reads before answering. If your page isn't retrieved, it can't be cited. Clear, structured, and well-ranked content is more likely to be selected.