Glossary · GEO

Token

A token is the basic unit that a large language model (LLM) works with: a fragment of text that can represent a whole word, part of a word, a single character, or a punctuation mark. Before processing a query or a document, the model splits the text into tokens through a process called tokenization, then converts each token into a numerical vector. In English, a common word often maps to a single token, while a long, rare, or technical term breaks into several tokens. Models reason and bill in tokens, not words: their context window, their input and output limits, and the cost of API usage are all measured in token counts. For GEO, understanding the token clarifies how an LLM reads, segments, and weighs the content of a page before deciding which passages to cite in its answers.

The token is the elementary building block through which every large language model perceives text. Where a human reads words and sentences, the model only sees a sequence of tokens: standardized fragments it learned to recognize during training.

How it works

Before any computation, the model applies tokenization to the text it receives. A tokenizer splits the character string into tokens based on a fixed vocabulary, usually learned through statistical compression (Byte-Pair Encoding-type algorithms). Each token is then mapped to a numerical identifier and transformed into an embedding, a vector the model can manipulate mathematically.
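To make the idea of a learned vocabulary concrete, here is a minimal Python sketch of Byte-Pair Encoding-style merge learning. It is a toy, not any production tokenizer: the function names (learn_bpe_merges, merge_pair, tokenize) and the tiny word list are assumptions made for illustration, and real tokenizers typically work on bytes, handle spaces, and train on far larger corpora.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE training)."""
    # Start with every word split into single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent pair of symbols occurs in the corpus.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # nothing left to merge
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            new_corpus[merge_pair(symbols, best)] += freq
        corpus = new_corpus
    return merges


def tokenize(word, merges):
    """Split `word` into characters, then apply the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Hypothetical training words, chosen only for illustration.
    training_words = ["optimize", "optimal", "optimization", "organization"]
    merges = learn_bpe_merges(training_words, 10)
    print(tokenize("optimization", merges))
```

Each resulting fragment would then be assigned an integer identifier, and that sequence of identifiers is what the model actually receives.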

The split is not intuitive. The word "optimization" may become a single token or break into "optim" + "ization" depending on the tokenizer. Spaces, punctuation, and capitalization count too. Because each vocabulary reflects the text the tokenizer was trained on, the same content can take a different number of tokens from one language to another.

Why it matters for GEO

Models do not reason in words but in tokens, and three concrete consequences follow for AI visibility.

First, the context window, meaning the amount of text a model can process at once, is measured in tokens. Content that is too long or poorly structured risks being truncated.

Second, API cost and latency depend on the volume of input and output tokens. This constrains how many sources an answer engine reads and how much of each it keeps.

Finally, the way your page is segmented into tokens influences how it is broken into passages, and therefore how citable it is by AI.
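The first consequence can be sketched in a few lines of Python. This is an illustration under stated assumptions: count_tokens_approx is a crude stand-in that counts whitespace-separated chunks rather than real tokens, and fit_to_context is a hypothetical helper, not the behavior of any specific answer engine.

```python
def count_tokens_approx(text):
    """Crude stand-in for a real tokenizer: one 'token' per whitespace-separated chunk."""
    return len(text.split())


def fit_to_context(passages, max_tokens, count_tokens=count_tokens_approx):
    """Keep passages in order until the token budget would be exceeded.

    Returns the kept passages and the number of tokens they use.
    """
    kept = []
    used = 0
    for passage in passages:
        cost = count_tokens(passage)
        if used + cost > max_tokens:
            break  # this passage and everything after it is dropped
        kept.append(passage)
        used += cost
    return kept, used


if __name__ == "__main__":
    page = [
        "A token is the basic unit an LLM works with.",
        "Tokenization splits text using a fixed vocabulary.",
        "A long closing section that may not fit in the remaining budget at all.",
    ]
    kept, used = fit_to_context(page, max_tokens=20)
    print(len(kept), "passages kept,", used, "tokens used")
```

In this sketch, whatever comes late in a long page is the first thing to be lost, which is one argument for putting the key statement of a page early and keeping passages short.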

Key takeaway

LLMs read, count, and bill in tokens, not words. Clear, well-segmented content is more efficient to process — and more likely to be cited.

A concrete example

The phrase "LUWIZ boosts AI visibility" is four words but may represent seven to nine tokens depending on the tokenizer, with the brand name "LUWIZ" often split into several pieces because it is absent from the vocabulary. This is one reason why establishing a brand as a recognized named entity carries so much weight in GEO.
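The effect of a missing vocabulary entry can be shown with a greedy longest-match split, similar in spirit to WordPiece-style tokenizers. The vocabulary below is entirely hypothetical and far smaller than a real one, so the token count it produces is illustrative and does not reproduce the figures of any actual tokenizer.

```python
# Hypothetical vocabulary: common words are present as whole entries,
# the brand name is not, so it can only be covered by smaller pieces.
HYPOTHETICAL_VOCAB = {" boosts", " AI", " visibility", "LU", "W", "IZ"}


def greedy_tokenize(text, vocab):
    """Split `text` into the longest matching vocabulary entries, left to right.

    Characters that match no entry are emitted as single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


if __name__ == "__main__":
    print(greedy_tokenize("LUWIZ boosts AI visibility", HYPOTHETICAL_VOCAB))
    # ['LU', 'W', 'IZ', ' boosts', ' AI', ' visibility']
```

The three common words cost one token each, while the unknown brand name alone costs three.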

FAQ

What is the difference between a word and a token?

A word is a linguistic unit; a token is a technical unit of the model. A common word often equals one token, but a long or rare term splits into several tokens. On average, in English, expect roughly 0.75 words per token, or about 1.3 tokens per word.
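The rule of thumb above translates into a quick estimate. The helper below is a hypothetical sketch: it only multiplies a whitespace word count by the 1.3 ratio, so the result is an order-of-magnitude guide for English text, not a substitute for running the model's actual tokenizer.

```python
TOKENS_PER_WORD = 1.3  # rough English average; varies by tokenizer, language, and text


def estimate_tokens(text, tokens_per_word=TOKENS_PER_WORD):
    """Estimate a token count from a whitespace word count."""
    word_count = len(text.split())
    return round(word_count * tokens_per_word)


if __name__ == "__main__":
    sample = " ".join(["word"] * 100)  # a 100-word placeholder text
    print(estimate_tokens(sample))  # 130
```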

Why do tokens affect the cost of using an LLM?

LLM APIs bill by the number of input and output tokens, not by word count. The model's context window is also measured in tokens. Reducing the verbosity of a prompt therefore directly lowers both cost and latency.
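As a rough illustration of that billing model, the sketch below compares a verbose and a concise prompt. The function name and the prices (2.00 and 8.00 per million tokens) are invented for the example and do not correspond to any provider's actual rates.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Return the cost of one call, in the currency of the given prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical prices per million tokens, for illustration only.
    INPUT_PRICE = 2.00
    OUTPUT_PRICE = 8.00

    verbose = estimate_cost(1200, 300, INPUT_PRICE, OUTPUT_PRICE)
    concise = estimate_cost(400, 300, INPUT_PRICE, OUTPUT_PRICE)
    print(f"verbose prompt: {verbose:.4f}")  # 0.0048
    print(f"concise prompt: {concise:.4f}")  # 0.0032
```

With the same answer length, trimming the prompt from 1,200 to 400 tokens cuts the cost of the call by a third under these assumed prices.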
