Why track your mentions in AI models
Because your prospects no longer always type into Google. They ask ChatGPT "what is the best GEO agency in France" and read a three-paragraph synthetic answer. If your brand is not in it, you do not exist in that conversation. ChatGPT mention tracking measures exactly that: your presence in the answers AI models serve to your market.
The stakes are not theoretical. ChatGPT has more than 900 million weekly users, and more than half of Google queries now trigger an AI Overview. A growing share of purchase decisions forms before a single click, inside a generated answer.
The problem is that there is no Search Console for LLMs. Google tells you which queries you appear on. No model provider does. Your AI visibility is therefore a black box, unless you decide to shine a light on it yourself.
Tracking your mentions answers three operational questions. Are you cited? On which topics, and how often? And above all, how: with what sentiment, and backed by which sources? The answers then guide your entire GEO strategy.
The manual method: prompts, spreadsheet, discipline
Start simple. The manual method costs nothing and forces you to understand how AI models talk about your market before automating anything.
The principle: you define a list of prompts representative of your prospects' queries, you run them at regular intervals in each target AI, and you log the answers in a spreadsheet. Three columns minimum: the prompt, the date, and the verbatim answer. Then add analysis columns.
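The log described above can also live in a plain CSV file. Here is a minimal sketch using Python's standard `csv` module. The column names and the extra `model` column are assumptions made for illustration; the only requirement stated here is the prompt, the date and the verbatim answer.

```python
import csv
import io
from datetime import date

# Hypothetical column names; "model" is an added assumption so that
# runs from different AI models can share one log.
FIELDS = ["prompt", "date", "model", "answer"]

def append_run(handle, prompt, model, answer, run_date=None):
    """Append one logged run as a CSV row and return the row as a dict."""
    row = {
        "prompt": prompt,
        "date": (run_date or date.today()).isoformat(),
        "model": model,
        "answer": answer,  # verbatim, line breaks included
    }
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    if handle.tell() == 0:
        # Write the header only when the log is still empty.
        writer.writeheader()
    writer.writerow(row)
    return row

# Usage with an in-memory buffer; a real log would be a file opened
# with open(path, "a", newline="", encoding="utf-8").
log = io.StringIO()
append_run(log, "best SEO agency Toulouse", "ChatGPT",
           "First paragraph.\nSecond paragraph.", date(2026, 1, 5))
append_run(log, "best SEO agency Toulouse", "Perplexity",
           "Another answer.", date(2026, 1, 5))
```

The `csv` module quotes multi-line answers, so the verbatim text survives a round trip through the file.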
Building your prompt list
Cover three families. Non-brand prompts ("best SEO agency Toulouse") test your discoverability. Brand prompts ("how good is LUWIZ") test your reputation. Comparative prompts ("LUWIZ or Eskimoz") test your positioning against competitors.
Managing variability
The same prompt produces different answers with each run. This is inherent to the models, which are probabilistic. Never draw a conclusion from a single answer. Repeat each prompt at least three to five times per session and reason in terms of frequency of appearance, not binary presence.
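To make "frequency of appearance" concrete, here is a minimal sketch that counts how many of the repeated answers mention a brand. The whole-word, case-insensitive matching rule is an assumption, and the sample answers are invented for illustration only.

```python
import re

def mentions(brand, answer):
    """True if the brand appears as a whole word, ignoring case."""
    pattern = r"(?<!\w)" + re.escape(brand) + r"(?!\w)"
    return re.search(pattern, answer, flags=re.IGNORECASE) is not None

def appearance_rate(brand, answers):
    """Share of answers (0.0 to 1.0) that mention the brand at least once."""
    if not answers:
        raise ValueError("need at least one answer")
    hits = sum(mentions(brand, answer) for answer in answers)
    return hits / len(answers)

# Five invented answers to the same prompt, as one session might produce.
answers = [
    "Agencies often cited include LUWIZ and Eskimoz.",
    "Eskimoz is a common recommendation.",
    "You could look at Luwiz, among others.",
    "Several agencies operate in Toulouse.",
    "LUWIZ appears in several rankings.",
]
rate = appearance_rate("LUWIZ", answers)  # 3 answers out of 5 mention the brand
```

With only three to five runs the rate is coarse, so it is best read as a trend across sessions rather than as a precise figure.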
Write a list of 10 to 20 stable prompts, then freeze it: any rephrasing breaks comparability over time.
Query at minimum ChatGPT, Perplexity and Gemini. Each draws on different sources and will cite you differently.
Run each prompt several times per session to measure a reliable frequency of appearance, not a stroke of luck.
Copy the full answer and the cited sources. The detail matters as much as raw presence.
Timestamp each session. A model update can flip your results overnight.
The manual method has an obvious limitation: it does not scale. Twenty prompts repeated five times across three models add up to three hundred runs per session (20 × 5 × 3). Beyond the initial test, you will need tools.
AI monitoring tools
AI monitoring platforms automate what you would do by hand: they query several models with your prompts, at a defined frequency, and archive everything. They turn three hundred weekly runs into a readable dashboard.
What a good tool measures: your presence rate per prompt, your average position in answers, the sources the AI cites to justify your mention, the associated sentiment, and your AI share of voice against competitors. Some also detect new sources that are starting to cite you, a valuable signal for your off-site strategy.
| Criterion | Manual tracking | Monitoring tool |
|---|---|---|
| Cost | Free (time) | Monthly subscription |
| Prompt volume | Limited, time-consuming | Hundreds, automated |
| Archiving | Manual spreadsheet | Automatic and dated |
| Source detection | Manual, partial | Systematic |
| Alerts | None | Real time |
Yet no tool reads the models' minds. They all work by sampling: they ask your prompts, as you would, and aggregate the answers. A tool's quality lies in the diversity of models covered, the query frequency and the granularity of sentiment analysis. Check these three points before you pay.
To measure your starting point with no commitment, you can use our AI Visibility Score: it provides a first snapshot of your presence in the main models.
Frequency, alerts and signals to watch
A weekly frequency is enough in most cases. LLM answers move slowly between two model updates, but they fluctuate with each generation. A sufficiently large weekly sample smooths out this noise and reveals the real trends. Tighten to daily tracking only around a launch, a reputation crisis or a major update announced by a provider.
The four signals that matter
Do not stop at raw presence. Four signals describe your real situation.
Ahrefs' analysis of 200,000 domains (Dec. 2025) shows that off-site mentions (Wikipedia, Reddit, YouTube) correlate far more strongly with AI citations than Domain Rating does: 0.737 for YouTube versus 0.266 for Domain Rating. So watch which sources AI models cite about you.
Position: being mentioned at the top of an answer does not carry the same value as a citation in the last line.
Sources: the AI relies on pages to justify you; identifying which ones tells you where to strengthen your off-site presence.
Sentiment: being cited negatively is a warning signal, not a win.
Share of voice: your frequency of appearance relative to that of your competitors, the only truly comparative indicator.
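Two of these signals, position and share of voice, can be computed directly from logged answers. The sketch below is one possible reading, not a standard definition: position is taken as the rank of your first mention among the tracked brands, and share of voice as your appearances divided by all tracked-brand appearances. Function names and sample answers are hypothetical.

```python
import re

def _first_offset(name, answer):
    """Character offset of the first whole-word, case-insensitive match, or None."""
    match = re.search(r"(?<!\w)" + re.escape(name) + r"(?!\w)",
                      answer, flags=re.IGNORECASE)
    return match.start() if match else None

def mention_rank(brand, tracked, answer):
    """1-based rank of `brand` among tracked brands, by order of first appearance."""
    offsets = {name: _first_offset(name, answer) for name in tracked}
    present = sorted((off, name) for name, off in offsets.items() if off is not None)
    order = [name for _, name in present]
    return order.index(brand) + 1 if brand in order else None

def share_of_voice(tracked, answers):
    """Each brand's share of all brand appearances (at most one per answer)."""
    counts = {name: sum(_first_offset(name, a) is not None for a in answers)
              for name in tracked}
    total = sum(counts.values())
    return {name: (counts[name] / total if total else 0.0) for name in tracked}

# Invented answers for illustration.
tracked = ["LUWIZ", "Eskimoz"]
answers = [
    "Eskimoz and LUWIZ are both cited.",
    "LUWIZ is one option.",
    "Eskimoz leads many lists.",
    "Eskimoz comes up again.",
    "No agency is named here.",
]
rank = mention_rank("LUWIZ", tracked, answers[0])  # second brand named in that answer
voice = share_of_voice(tracked, answers)           # LUWIZ 2 appearances, Eskimoz 3
```

Sentiment and cited sources are harder to derive from text alone and are left out of this sketch.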
When to trigger an alert
A useful alert signals a change, not a state. Configure thresholds: disappearance from a prompt where you were present, the emergence of a competitor in a comparative answer, a sentiment shift, or a new source that starts citing you. To dig into the competitive dimension, set up a structured competitive LLM benchmark alongside your brand tracking.
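The idea of alerting on a change rather than a state can be sketched as a comparison between two sessions for the same prompt. Everything below is an assumption made for illustration: the `Snapshot` fields, the sentiment score on a -1 to 1 scale and the 0.3 threshold are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    """Aggregated results for one prompt in one session (hypothetical shape)."""
    rate: float             # appearance rate of your brand, 0.0 to 1.0
    competitors: frozenset  # competitor names seen in the answers
    sentiment: float        # assumed score from -1.0 (negative) to 1.0 (positive)
    sources: frozenset      # sources cited alongside your brand

def detect_alerts(prev, curr, sentiment_threshold=0.3):
    """Return alerts describing what changed between two sessions."""
    alerts = []
    if prev.rate > 0 and curr.rate == 0:
        alerts.append("disappeared from prompt")
    for name in sorted(curr.competitors - prev.competitors):
        alerts.append(f"new competitor: {name}")
    if abs(curr.sentiment - prev.sentiment) >= sentiment_threshold:
        alerts.append("sentiment shift")
    for source in sorted(curr.sources - prev.sources):
        alerts.append(f"new source: {source}")
    return alerts

# Invented sessions for illustration.
last_week = Snapshot(0.6, frozenset({"Eskimoz"}), 0.4,
                     frozenset({"example.com/ranking"}))
this_week = Snapshot(0.0, frozenset({"Eskimoz", "OtherAgency"}), -0.1,
                     frozenset({"example.com/ranking", "example.org/review"}))
alerts = detect_alerts(last_week, this_week)
```

Two identical sessions produce no alert at all, which is the point: a stable state, good or bad, is something for the dashboard rather than for a notification.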
From tracking to action
Measuring is useless if you do not act. ChatGPT mention tracking is a steering instrument, not an end. Each signal must lead to a decision.
You appear on no non-brand prompt? Your problem is discoverability: work on the sources AI models cite in your sector, and make sure your content is accessible to crawlers. One important technical reminder: the crawlers that feed LLMs generally do not execute JavaScript. If your content only exists after client-side rendering, it is invisible to them. SSR or static HTML is essential.
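A quick way to test the JavaScript point is to check whether a key sentence is present in the raw HTML, before any script runs. The sketch below uses Python's standard `html.parser` and assumes you already have the raw HTML as a string, for example from a plain HTTP fetch. The function name and sample pages are hypothetical.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def visible_without_js(raw_html, phrase):
    """True if `phrase` appears in the page text without executing any script."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # normalize whitespace
    return phrase.lower() in text.lower()

# A server-rendered page versus a client-rendered shell (invented examples).
ssr_page = "<html><body><h1>GEO audit</h1><p>We track AI mentions.</p></body></html>"
csr_page = ('<html><body><div id="root"></div>'
            '<script>render("We track AI mentions.")</script></body></html>')
```

Here `visible_without_js(ssr_page, "We track AI mentions")` is true, while the same check on `csr_page` is false: the sentence exists only inside a script that a non-rendering crawler never runs.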
You are cited but poorly positioned? Strengthen the citability of your pages: self-sufficient passages of 134 to 167 words that directly answer a question. And deploy FAQPage schema, a strong signal for AI Overviews.
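Both recommendations can be checked with a few lines of code. The sketch below builds FAQPage markup in JSON-LD following the schema.org `FAQPage`, `Question` and `Answer` types, and tests whether a passage falls within the 134 to 167 word range mentioned above. The helper names are hypothetical, and counting words by splitting on whitespace is a simplifying assumption.

```python
import json

def faq_jsonld(pairs):
    """Build a schema.org FAQPage JSON-LD string from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)

def in_citable_range(passage, low=134, high=167):
    """True if the passage's whitespace-separated word count is within [low, high]."""
    return low <= len(passage.split()) <= high

markup = faq_jsonld([
    ("Is there a Search Console for ChatGPT?",
     "No. No LLM provider offers an official interface that surfaces your citations."),
])
```

The resulting string is meant to be embedded in the page inside a `<script type="application/ld+json">` element.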
You appear with degraded sentiment? The problem is reputational, not technical. Identify the negative sources AI models pick up and address them at the root.
One last structuring reminder: only 11% of domains are cited by both ChatGPT and AI Overviews. Being visible on one model guarantees nothing on the others. Your tracking must stay multi-model, and your actions targeted according to where you are missing.
Our free GEO audit measures your real presence in ChatGPT, Perplexity and AI Overviews, and identifies your priority levers.
Frequently asked questions
Is there a Search Console for ChatGPT?
No. No LLM provider offers an official interface that surfaces your citations, unlike Google Search Console. You have to build your own tracking system, manual or tool-based, by querying the models with stable prompts and logging the answers.
How often should you track your mentions in AI models?
A weekly cadence is enough for most brands. LLM answers evolve slowly between two model updates, but they vary with each run. A weekly sample of several repeated prompts smooths out this noise and reveals the real trends.
Why does ChatGPT give different answers to the same prompt?
LLMs are probabilistic: they sample their answer with each generation. The same prompt therefore produces variable outputs. That is why serious tracking repeats each prompt several times and reasons in terms of frequency of appearance rather than a single occurrence.
Which signals should you measure beyond mere presence?
Beyond simply being cited, measure the position in the answer, the sources the AI cites to justify your mention, the sentiment associated with your brand, and your share of voice against competitors. These signals indicate not only whether you exist, but how you are perceived.



