Token Counter
Count tokens for GPT-4, Claude, Gemini, and Llama.
Token count is approximate (GPT BPE-style). Actual counts may vary ±10–15% per model.
How the Token Counter works
The token counter estimates the number of tokens in any text using GPT-style BPE tokenisation (the cl100k_base encoding used by OpenAI's GPT-4 and GPT-3.5-turbo). Counts for Claude and other models are approximations, because those models use their own tokenisers. Paste your prompt, system instructions, or document: the tool shows the token count, the estimated API cost at the listed prices, and the percentage of each model's context window that is consumed. It is aimed at prompt engineers and developers building LLM-powered applications.
What is a token?
A token is not a word — it is a chunk of text as defined by the model's tokeniser. Common English words are typically 1 token (e.g., "hello" = 1 token), but longer or uncommon words split into multiple tokens ("cryptocurrency" = 3 tokens). Spaces, punctuation, and line breaks also consume tokens. On average, 1 token ≈ 4 characters or ¾ of a word in English. Non-English languages, code, and numbers often use more tokens per character.
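The 4-characters-per-token rule of thumb above can be turned into a quick estimator. The following is a minimal Python sketch: the function name `estimate_tokens` and its `chars_per_token` parameter are illustrative assumptions, and the result is a rough heuristic for English prose, not the output of a real tokeniser.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a tokeniser)."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    if not text:
        return 0
    # Round up so that even a very short non-empty string counts as one token.
    return math.ceil(len(text) / chars_per_token)
```

For code, numbers, or non-English text, a smaller `chars_per_token` value is likely to be closer to reality, since those inputs tend to use more tokens per character.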
Context window limits by model
Different models have different context window sizes: GPT-3.5-turbo supports 16,385 tokens, GPT-4o supports 128,000 tokens, GPT-4-32k supports 32,768, and Claude 3 Opus supports 200,000 tokens. The context window includes both the input (prompt + conversation history) and the output (generated response). The token counter shows your input's percentage of the selected model's context limit so you can stay within bounds.
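The percentage calculation described above might look roughly like this in Python. It is a sketch under stated assumptions: the dictionary keys are hypothetical labels chosen for illustration (not necessarily official API model identifiers), the limits are the figures quoted in this section, and the helper names are invented for this example.

```python
# Context window sizes (in tokens) as quoted in the text above.
# The keys are illustrative labels, not guaranteed API model names.
CONTEXT_WINDOWS = {
    "gpt-3.5-turbo": 16_385,
    "gpt-4-32k": 32_768,
    "gpt-4o": 128_000,
    "claude-3-opus": 200_000,
}


def context_usage_percent(input_tokens: int, model: str) -> float:
    """Share of the model's context window taken up by the input, in percent."""
    return 100.0 * input_tokens / CONTEXT_WINDOWS[model]


def remaining_for_output(input_tokens: int, model: str) -> int:
    """Tokens left for the response, since input and output share the window."""
    return max(CONTEXT_WINDOWS[model] - input_tokens, 0)
```

Because the window covers both input and output, a value of zero from `remaining_for_output` means the prompt alone already fills or exceeds the limit.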
Cost estimation per 1M tokens
API providers charge per million tokens. At the list prices used here, GPT-4o input costs $5.00/M tokens and output $15.00/M, GPT-3.5-turbo input is $0.50/M, and Anthropic's Claude 3 Sonnet input is $3.00/M. At these rates, a 10,000-token prompt processed 100 times per day adds up to 1,000,000 input tokens, or roughly $5/day on GPT-4o. The token counter makes this arithmetic instant, so you can project costs before committing to a production architecture.
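The cost arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the price passed in is whatever per-million rate you want to test (the figures in this section are used as sample inputs only).

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a number of input tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def daily_input_cost_usd(
    tokens_per_request: int,
    requests_per_day: int,
    price_per_million: float,
) -> float:
    """Projected daily input cost for a prompt of fixed size sent repeatedly."""
    total_tokens = tokens_per_request * requests_per_day
    return input_cost_usd(total_tokens, price_per_million)


# The example from the text: 10,000 tokens, 100 requests/day, $5.00 per million.
example_daily_cost = daily_input_cost_usd(10_000, 100, 5.00)
```

Output tokens are priced separately, so a full projection would add a second term for the expected response length at the output rate.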
Tiktoken cl100k_base tokenisation
GPT-4 and GPT-3.5-turbo use the cl100k_base tokeniser from OpenAI's tiktoken library. Claude uses a different tokeniser internally, but the token counts are similar enough for estimation purposes. The counter uses a JavaScript port of tiktoken running entirely in the browser: no API calls are made, your text is never uploaded, and token counting works offline. Developers can replicate the count in Python with len(tiktoken.get_encoding("cl100k_base").encode(text)).
Frequently asked questions
- What is a token in AI models?
- Tokens are the basic units that language models process, roughly 4 characters or 0.75 words in English. Common words are single tokens; rare words, code, and non-English text often require more tokens per word. API usage is billed per token.
- Why does token count vary by model?
- Different models use different tokenisation algorithms (BPE, SentencePiece, etc.) trained on different vocabularies. GPT-4 uses cl100k_base, while Claude uses its own tokeniser. As a result, the same text can have different token counts across models.
- What is a context window?
- The context window is the maximum number of tokens a model can process in a single request (input + output combined). GPT-4o, for example, supports 128,000 tokens; Claude 3.5 Sonnet and Gemini 1.5 Pro have their own limits. Inputs exceeding the context window are truncated or rejected.
- How accurate is this counter?
- The token count is approximate, typically within ±10–15% of a given model's actual count, because an exact count requires that model's own tokeniser (tiktoken for OpenAI models, for example). For precise OpenAI counts, use the OpenAI Tokenizer (platform.openai.com/tokenizer) or the tiktoken Python library.
- How is API cost calculated?
- API cost = (token count / 1,000,000) × price per million tokens. Input and output tokens are priced differently — outputs are typically 3–4× more expensive than inputs. Costs shown are for input tokens only at current list prices.