Grok 4.20
Grok 4.20 is a Grok model from xAI, released in March 2026. It costs $2 / $6 per 1M tokens (input / output), has a 2M-token context window, and is best suited to reasoning, realtime web search, and long-context work. Last verified 2026-04-19.
Spec sheet
Pricing
- Input: $2 / 1M
- Output: $6 / 1M
- Cached input: $0.50 / 1M
- Free tier: No

Context & speed
- Context window: 2M tokens
- Max output: 32k tokens
- Throughput: ~90 tok/s
- Time to first token: ~700 ms
- Speed tier: balanced

Capabilities
- Tool use: Yes
- Structured output: Yes
- Prompt caching: Yes
- Extended thinking: Yes
- Vision input: Yes
- Audio in / out: No
- Fine-tuning: No

Deployment
- Open weights: No
- On-prem: No
- HIPAA eligible: No
- Zero retention: No
- Regions: US
Estimated monthly cost
Estimates assume a typical token shape of 2k input and 600 output tokens per call; prompt-caching discounts are excluded.
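The assumed token shape can be turned into a rough dollar figure directly from the listed rates. A minimal sketch; the prices and token shape come from the spec sheet above, while the 100,000 calls/month volume is a made-up illustration, not a plan tier:

```python
# Rough monthly cost estimate for Grok 4.20 at the listed API rates.
INPUT_PRICE_PER_M = 2.00    # USD per 1M input tokens (from the spec sheet)
OUTPUT_PRICE_PER_M = 6.00   # USD per 1M output tokens (from the spec sheet)

def cost_per_call(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call, ignoring prompt caching."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

per_call = cost_per_call(2_000, 600)  # typical shape: 2k in, 600 out
monthly = per_call * 100_000          # hypothetical 100k calls/month

print(f"${per_call:.4f} per call")    # $0.0076 per call
print(f"${monthly:.2f} per month")    # $760.00 per month
```

At this shape the output tokens dominate slightly ($0.0036 of the $0.0076), so trimming verbose responses moves the bill more than trimming prompts.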
When to use Grok 4.20
Sweet spot
- reasoning
- realtime web search
- long context
- X data
Known trade-offs
- US-only regions
- younger SDK ecosystem
Works with
- xAI SDK
- OpenAI-compatible API
- Cursor
- Vercel AI SDK
- OpenRouter
FAQ — Grok 4.20
How much does Grok 4.20 cost?
Grok 4.20 costs $2 / $6 per 1M tokens (input / output) on the xAI API. Cached input reads cost $0.50 per 1M, cutting the input bill by roughly 75% on repeated system prompts.
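The 75% figure follows directly from the two input rates; a quick check using only the prices listed above:

```python
# Prompt-caching discount on input tokens, from the listed rates.
REGULAR_INPUT = 2.00  # USD per 1M uncached input tokens
CACHED_INPUT = 0.50   # USD per 1M cached input tokens

savings = 1 - CACHED_INPUT / REGULAR_INPUT
print(f"{savings:.0%} cheaper on cached reads")  # 75% cheaper on cached reads
```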
What is the context window of Grok 4.20?
Grok 4.20 has a 2M-token context window with up to 32k tokens of output. That's enough for entire codebases, long transcripts, or multi-document RAG.
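To get a feel for what fits, a rough capacity check using the common ~4 characters per token heuristic; this is an approximation, and real counts require the model's tokenizer:

```python
# Rough fit check against the 2M-token context window.
# The 4-chars-per-token ratio is a heuristic, not the model's tokenizer.
CONTEXT_WINDOW = 2_000_000
CHARS_PER_TOKEN = 4

def roughly_fits(texts: list[str], reserve_for_output: int = 32_000) -> bool:
    """True if the combined texts likely fit, leaving room for max output."""
    est_tokens = sum(len(t) for t in texts) // CHARS_PER_TOKEN
    return est_tokens <= CONTEXT_WINDOW - reserve_for_output

# ~6 MB of text is ~1.5M estimated tokens, well under the window.
print(roughly_fits(["x" * 6_000_000]))  # True
```

Reserving the 32k max output up front avoids requests that tokenize within the window but leave no room for the response.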
Does Grok 4.20 have a free tier?
No. Grok chat is bundled with X Premium on the web and in the app, but API access to Grok 4.20 is paid per token.
Is Grok 4.20 HIPAA / EU / on-prem friendly?
Grok 4.20 is not HIPAA-eligible, not available in an EU region, and is API-only. Zero data retention is not available.
What is Grok 4.20 best for?
Grok 4.20 is best for reasoning, realtime web search, long context, and X data. Trade-offs to be aware of: US-only regions and a younger SDK ecosystem.
Which tools and SDKs work with Grok 4.20?
Grok 4.20 integrates with xAI SDK, OpenAI-compatible API, Cursor, Vercel AI SDK, OpenRouter. Most major AI frameworks support it either natively or through OpenAI-compatible endpoints.
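Because the API is OpenAI-compatible, most OpenAI clients only need a different base URL and key. A minimal stdlib sketch that builds (but does not send) an OpenAI-style chat request; the base URL, model id string, and env var name are assumptions to illustrate the shape, so check xAI's docs for the real values:

```python
import json
import os
import urllib.request

# Assumed values -- verify against xAI's documentation.
BASE_URL = "https://api.x.ai/v1"
MODEL_ID = "grok-4.20"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-style chat completion request."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('XAI_API_KEY', '')}",
        },
        method="POST",
    )

req = build_chat_request("Summarize today's top posts about rocket launches.")
# urllib.request.urlopen(req) would send it; omitted to keep the sketch offline.
```

With the official `openai` Python SDK, the equivalent is typically constructing the client with `base_url` and `api_key` overrides, since OpenAI-compatible endpoints accept that client unchanged.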