Google Free tier

Gemini 3 Flash

Gemini 3 Flash is a Gemini 3 model from Google, released in February 2026. It costs $0.50 per 1M input tokens and $3 per 1M output tokens, has a 1M-token context window, and is best suited to low-cost production workloads, low-cost long-context tasks, and low-cost multimodal tasks. Last verified 2026-04-19.

Spec sheet

Pricing

Input: $0.50 / 1M tokens
Output: $3 / 1M tokens
Cached input: $0.125 / 1M tokens
Batch discount: 50%
Free tier: Google AI Studio

Context & speed

Context window: 1M tokens
Max output: 66k tokens
Throughput: ~200 tok/s
Time to first token: ~320 ms
Speed tier: fast
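
As a rough illustration of what these speed figures imply, the wall-clock time of a response can be approximated as time to first token plus output tokens divided by throughput. The sketch below assumes the ~320 ms and ~200 tok/s figures hold steadily across a whole response, which real traffic will not guarantee; the function and constant names are illustrative and not part of any SDK.

```python
# Figures taken from the spec sheet above; both are approximate.
TTFT_SECONDS = 0.320            # ~320 ms time to first token
THROUGHPUT_TOK_PER_S = 200.0    # ~200 output tokens per second


def estimated_response_seconds(output_tokens: int) -> float:
    """Rough end-to-end latency: first-token delay plus streaming time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / THROUGHPUT_TOK_PER_S
```

For a 600-token reply this gives about 3.3 seconds (0.32 s plus 3.0 s of streaming).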

Capabilities

Tool use: Yes
Structured output: Yes
Prompt caching: Yes
Extended thinking: Yes
Vision input: Yes
Audio in / out: Yes
Fine-tuning: Yes

Deployment

Open weights: No
On-prem: No
HIPAA eligible: Yes
Zero retention: Yes
Regions: us, eu, apac

Estimated monthly cost

Assumes a typical call of 2k input tokens and 600 output tokens. Prompt caching is excluded from these figures.

10k calls/mo: $28.00 per month
100k calls/mo: $280.00 per month
1M calls/mo: $2,800.00 per month
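
The monthly figures follow directly from the listed per-token prices and the assumed call shape. A minimal sketch of that arithmetic, using the list prices and the 2k-input / 600-output shape from this page (the function and constant names are illustrative):

```python
# List prices from the pricing table, in USD per 1M tokens.
INPUT_USD_PER_M = 0.50
OUTPUT_USD_PER_M = 3.00


def monthly_cost(calls_per_month: int,
                 input_tokens: int = 2_000,
                 output_tokens: int = 600) -> float:
    """Estimated monthly spend in USD, ignoring caching and batch discounts."""
    total_input = calls_per_month * input_tokens
    total_output = calls_per_month * output_tokens
    return (total_input * INPUT_USD_PER_M
            + total_output * OUTPUT_USD_PER_M) / 1_000_000
```

Each call costs $0.0028 under these assumptions ($0.0010 for input plus $0.0018 for output), so `monthly_cost(10_000)` works out to $28 and the other two rows scale linearly from there.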

When to use Gemini 3 Flash

Sweet spot

  • low-cost production workloads
  • low-cost long-context tasks
  • low-cost multimodal tasks
  • maps

Known trade-offs

  • falls below Pro on the hardest tasks

Works with

Google AI SDK, Vertex AI, Vercel AI SDK, LangChain, LlamaIndex, OpenRouter

FAQ — Gemini 3 Flash

How much does Gemini 3 Flash cost?

Gemini 3 Flash costs $0.50 per 1M input tokens and $3 per 1M output tokens on the Google API. Cached input reads cost $0.125 per 1M tokens, which cuts the cost of the cached portion of the input (for example, a repeated system prompt) by 75%. The batch API offers a 50% discount for asynchronous workloads.
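
To see how the cached rate and the batch discount change a bill, the sketch below applies each to the list prices above. It rests on several assumptions: cached tokens are billed purely at the cached-read rate, any cache storage or write charges (not listed on this page) are ignored, and the two discounts are modelled separately because the page does not say whether they stack. All names are illustrative.

```python
INPUT_USD_PER_M = 0.50
CACHED_INPUT_USD_PER_M = 0.125
BATCH_DISCOUNT = 0.50  # 50% off for async batch workloads


def input_cost_with_cache(input_tokens: int, cached_tokens: int) -> float:
    """Input cost in USD when `cached_tokens` of the input are cache reads."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_USD_PER_M
            + cached_tokens * CACHED_INPUT_USD_PER_M) / 1_000_000


def batch_cost(list_price_usd: float) -> float:
    """Cost in USD after the batch discount (assumed to apply to the whole bill)."""
    return list_price_usd * (1 - BATCH_DISCOUNT)
```

For example, if 1,500 of a call's 2,000 input tokens are a cached system prompt, the input cost drops from $0.0010 to about $0.00044 per call. The 75% saving applies only to the cached portion, so the overall input saving here is closer to 56%.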

What is the context window of Gemini 3 Flash?

Gemini 3 Flash has a 1M-token context window with up to 66k tokens of output. That's enough for entire codebases, long transcripts, or multi-document RAG.
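
A simple pre-flight check can tell whether a request is likely to fit within these limits. The sketch below treats the rounded figures on this page ("1M" and "66k") as exact token counts and assumes that requested output counts against the context window; both are assumptions, and the real limits should be taken from the provider's documentation. Names are illustrative.

```python
# Rounded limits from this page, treated here as exact for illustration.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 66_000


def fits_in_context(input_tokens: int, requested_output_tokens: int) -> bool:
    """True if the request stays within both the output cap and the window."""
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    # Assumption: input and output share the same context budget.
    return input_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS
```

Token counts themselves have to come from a tokenizer or a token-counting endpoint; this check only compares numbers that have already been measured.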

Does Gemini 3 Flash have a free tier?

Yes, with a reduced daily quota and rate limits suited to prototyping. Start at https://aistudio.google.com.

Is Gemini 3 Flash HIPAA / EU / on-prem friendly?

Gemini 3 Flash is HIPAA-eligible and available in US, EU, and APAC regions. It is API-only: there are no open weights and no on-prem deployment. Zero data retention is available for enterprise customers.

What is Gemini 3 Flash best for?

Gemini 3 Flash is best for low-cost production workloads, low-cost long-context tasks, low-cost multimodal tasks, and maps. The main trade-off is that it falls below Pro on the hardest tasks.

Which tools and SDKs work with Gemini 3 Flash?

Gemini 3 Flash integrates with Google AI SDK, Vertex AI, Vercel AI SDK, LangChain, LlamaIndex, and OpenRouter. Most major AI frameworks support it either natively or through OpenAI-compatible endpoints.