Will your prompt fit? Check context windows across 20+ frontier LLMs.

Paste text or drop a file. We'll show you which models can accept it and what each one costs per call.

Frequently asked

What is a context window?
A context window is the maximum number of tokens (word-pieces, typically a few characters each) a language model can read in a single request. Inputs that exceed it get truncated or rejected.
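A minimal sketch of that check, with an illustrative 128k window rather than any specific model's limit:

```python
def check_fit(prompt_tokens: int, window: int) -> str:
    """Classify a prompt against a model's context window.

    Providers differ on overflow: some silently truncate the oldest
    content, others reject the request with an error.
    """
    return "fits" if prompt_tokens <= window else "truncated or rejected"

print(check_fit(150_000, 128_000))  # "truncated or rejected"
```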
Tokens vs words — what's the difference?
A token is the unit each model actually reads. English averages around 0.75 words per token, so 1,000 words is typically 1,200–1,400 tokens, though the exact ratio varies by tokenizer. Code and non-English text use more tokens per character.
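A back-of-envelope estimate using that ratio (the 1.33 constant is an approximation, not any model's exact tokenizer):

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.33) -> int:
    """Rough token estimate from word count (~0.75 words per token in English).

    Real counts vary by tokenizer; code and non-English text run higher,
    so treat this as a sanity check, not an exact figure.
    """
    return round(len(text.split()) * tokens_per_word)

print(estimate_tokens("the quick brown fox " * 250))  # 1,000 words -> ~1,330
```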
Why does Claude count tokens differently from GPT?
Each provider trains its own tokenizer with its own vocabulary. The same string can produce different token counts on different models. We use OpenAI's exact tokenizer for OpenAI models and calibrated character-ratio estimates for the rest.
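That split looks roughly like the sketch below: OpenAI's open-source tiktoken package for exact counts, and a characters-per-token calibration for everyone else (the 3.6 ratio is an assumed value for illustration):

```python
import tiktoken

def count_tokens(text: str, model: str) -> int:
    """Exact count for OpenAI models; character-ratio estimate otherwise."""
    try:
        enc = tiktoken.encoding_for_model(model)  # raises KeyError if unknown
        return len(enc.encode(text))
    except KeyError:
        CHARS_PER_TOKEN = 3.6  # assumed calibration for non-OpenAI models
        return round(len(text) / CHARS_PER_TOKEN)

sample = "Context windows differ across providers."
print(count_tokens(sample, "gpt-4o"))       # exact, via tiktoken
print(count_tokens(sample, "claude-opus"))  # estimated, via ratio
```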
Should I leave room for the response?
Yes. The model writes its reply into the same context window. Reserve at least as many tokens as the longest reply you expect (4k is a reasonable default for chat; 16k+ for long structured output).
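In code, that just means subtracting the reserved output from the window before checking your prompt (generic names here, not a specific SDK):

```python
def max_prompt_budget(window: int, max_output_tokens: int) -> int:
    """Tokens left for the prompt after reserving the response budget."""
    return window - max_output_tokens

# e.g. a 200,000-token window with 4k reserved for a chat reply
print(max_prompt_budget(200_000, 4_096))  # 195,904
```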

All frontier model context windows

Static reference table. The interactive checker above uses the same data.

| Model | Provider | Window (tokens) | Max output (tokens) |
|---|---|---|---|
| Gemini 3.1 Pro | Google | 2,000,000 | 65,536 |
| GPT-5.5 | OpenAI | 1,050,000 | 128,000 |
| Claude Opus 4.7 | Anthropic | 1,000,000 | 64,000 |
| Claude Sonnet 4.6 | Anthropic | 1,000,000 | 64,000 |
| Gemini 3 Flash | Google | 1,000,000 | 65,536 |
| Gemini 3.1 Flash-Lite | Google | 1,000,000 | 8,192 |
| DeepSeek V4 Flash | DeepSeek | 1,000,000 | 384,000 |
| GPT-5.4 | OpenAI | 400,000 | 32,000 |
| GPT-5.4 Pro | OpenAI | 400,000 | 64,000 |
| GPT-5.4 Mini | OpenAI | 400,000 | 16,384 |
| Qwen3-Max | Alibaba | 262,144 | 32,768 |
| Grok 4.3 | xAI | 256,000 | 32,000 |
| Codestral 25.08 | Mistral | 256,000 | 8,192 |
| Claude Haiku 4.5 | Anthropic | 200,000 | 8,192 |
| GLM-5.1 | Other | 200,000 | 131,072 |
| GPT Realtime | OpenAI | 128,000 | 4,096 |
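One way a checker might hold this data, sketched with a hand-copied subset of the rows above (the structure and names are illustrative, not the site's actual source):

```python
# (window_tokens, max_output_tokens), copied from a few rows above
MODELS = {
    "Gemini 3.1 Pro":   (2_000_000, 65_536),
    "GPT-5.5":          (1_050_000, 128_000),
    "Claude Opus 4.7":  (1_000_000, 64_000),
    "Claude Haiku 4.5": (200_000, 8_192),
    "GPT Realtime":     (128_000, 4_096),
}

def models_that_fit(prompt_tokens: int, reserve_output: bool = True):
    """Yield models whose window holds the prompt plus (optionally) a full reply."""
    for name, (window, max_out) in MODELS.items():
        budget = window - (max_out if reserve_output else 0)
        if prompt_tokens <= budget:
            yield name

print(list(models_that_fit(500_000)))
# ['Gemini 3.1 Pro', 'GPT-5.5', 'Claude Opus 4.7']
```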