Token

The atomic unit of text a model reads and generates — typically a word, sub-word, or character chunk.

Models do not see raw characters; a tokenizer splits text into tokens (e.g. "tokenization" might become "token" + "ization"). Each token maps to an ID the model embeds and processes. Pricing, context windows, and throughput are all measured in tokens.
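To make the split concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary, the IDs, and the function names (`tokenize`, `encode`) are made up for illustration. Production tokenizers learn vocabularies of tens of thousands of entries and often use other algorithms such as byte-pair encoding, so treat this as a toy rather than how any particular model works.

```python
# Hypothetical toy vocabulary: token string -> token ID.
VOCAB = {"token": 0, "ization": 1, "iz": 2, "e": 3, "s": 4}


def tokenize(text, vocab):
    """Split text into tokens by repeatedly taking the longest matching piece."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No vocabulary entry starts at position i.
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return tokens


def encode(text, vocab):
    """Map text to the list of token IDs a model would actually receive."""
    return [vocab[t] for t in tokenize(text, vocab)]


print(tokenize("tokenization", VOCAB))
print(encode("tokenization", VOCAB))
```

With this toy vocabulary, "tokenization" splits into "token" + "ization", and the model would receive the two corresponding IDs rather than the twelve characters.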

As a rough rule of thumb, one token is about 4 characters or ¾ of a word in English.
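The rule of thumb can be turned into a quick estimator, for example to check whether a prompt is likely to fit in a context window. This is a sketch only: the function names are hypothetical, the ratios are the approximate English figures quoted above, and actual counts depend on the tokenizer and the language.

```python
CHARS_PER_TOKEN = 4      # rough English average, per the rule of thumb
WORDS_PER_TOKEN = 0.75   # i.e. about 4 tokens for every 3 words


def estimate_tokens_from_chars(text):
    """Estimate the token count from the character count."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text):
    """Estimate the token count from the whitespace-separated word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


sample = "Pricing, context windows, and throughput are all measured in tokens."
print(estimate_tokens_from_chars(sample))
print(estimate_tokens_from_words(sample))
```

The two estimates will usually differ a little. For anything that affects billing or hard limits, count with the model's actual tokenizer instead.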