Gemini 3.5 Flash

Gemini 3.5 Flash is a Gemini 3 model from Google, released in May 2026. It costs $1.50 per 1M input tokens and $9 per 1M output tokens, has a 1.0M-token context window, and is best suited to agents, coding, and long-context work. Last verified 2026-05-22.

Spec sheet

Pricing

Input: $1.50 / 1M tokens
Output: $9 / 1M tokens
Cached input: $0.15 / 1M tokens
Batch discount: 50%
Free tier: Google AI Studio

Context & speed

Context window: 1.0M tokens
Max output: 66k tokens
Throughput: ~220 tok/s
Time to first token: ~300 ms
Speed tier: fast

Capabilities

Tool use: Yes
Structured output: Yes
Prompt caching: Yes
Extended thinking: Yes
Vision input: Yes
Audio in / out: Yes
Fine-tuning: Yes

Deployment

Open weights: No
On-prem: No
HIPAA eligible: Yes
Zero retention: Yes
Regions: us, eu, apac

Estimated monthly cost

Assumes a typical call of 2k input tokens and 600 output tokens at the list prices of $1.50 / $9 per 1M tokens. Prompt caching and the batch discount are excluded from these figures.

10k calls/mo: $84.00 per month
100k calls/mo: $840.00 per month
1M calls/mo: $8,400.00 per month
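
The monthly figures follow directly from the per-token prices. The sketch below reproduces them, assuming the stated call shape (2k input tokens, 600 output tokens) and the list prices from the spec sheet; the function and constant names are illustrative, not part of any Google SDK.

```python
# List prices from the spec sheet, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00


def monthly_cost(calls, input_tokens=2_000, output_tokens=600):
    """Estimated monthly spend in USD, ignoring caching and batch discounts."""
    per_call = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return calls * per_call


if __name__ == "__main__":
    for calls in (10_000, 100_000, 1_000_000):
        print(f"{calls:>9,} calls/mo: ${monthly_cost(calls):,.2f}")
```

Each call costs $0.003 for input plus $0.0054 for output, or $0.0084 in total, so output tokens account for roughly two thirds of the bill at this call shape. Passing different `input_tokens` and `output_tokens` values adapts the estimate to other workloads.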

When to use Gemini 3.5 Flash

Sweet spot

  • agents
  • coding
  • long context
  • multimodal
  • general production

Known trade-offs

  • 3x more expensive than Gemini 3 Flash
  • non-global regions are priced 10% higher, at $1.65 input / $9.90 output per 1M tokens
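
To see what the regional surcharge means for a budget, the following sketch applies the 10% markup to the global prices and recomputes the per-call cost. It assumes the same 2k-input / 600-output call shape used for the monthly estimates; the helper names are hypothetical.

```python
# Global list prices in USD per 1M tokens, and the regional surcharge.
GLOBAL_INPUT_PER_M = 1.50
GLOBAL_OUTPUT_PER_M = 9.00
REGIONAL_MARKUP = 0.10  # non-global regions are priced 10% higher


def regional_price(global_price, markup=REGIONAL_MARKUP):
    """Price per 1M tokens in a non-global region, rounded to cents."""
    return round(global_price * (1 + markup), 2)


def cost_per_call(input_price, output_price, input_tokens=2_000, output_tokens=600):
    """USD cost of one call at the given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    regional_in = regional_price(GLOBAL_INPUT_PER_M)
    regional_out = regional_price(GLOBAL_OUTPUT_PER_M)
    global_call = cost_per_call(GLOBAL_INPUT_PER_M, GLOBAL_OUTPUT_PER_M)
    regional_call = cost_per_call(regional_in, regional_out)
    print(f"regional prices: ${regional_in:.2f} / ${regional_out:.2f} per 1M")
    print(f"10k calls/mo: ${10_000 * global_call:.2f} global, "
          f"${10_000 * regional_call:.2f} regional")
```

Because the markup applies equally to input and output, the total bill scales by the same 10% regardless of call shape.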

Works with

Google AI SDK, Vertex AI, GitHub Copilot, Vercel AI SDK, LangChain, LlamaIndex, OpenRouter

FAQ: Gemini 3.5 Flash

How much does Gemini 3.5 Flash cost?

Gemini 3.5 Flash costs $1.50 per 1M input tokens and $9 per 1M output tokens on the Google API. Cached input reads cost $0.15 per 1M tokens, which cuts the cost of the cached portion of a prompt (for example, a repeated system prompt) by 90%. The batch API offers a 50% discount for asynchronous workloads.
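
The effect of caching depends on how much of each prompt is actually served from cache. The sketch below is a rough model under two labeled assumptions: cache storage or write fees (if any) are ignored, and the 50% batch discount is taken to apply to both input and output tokens. Neither detail is stated in the figures above, and the function names are illustrative.

```python
# Prices from the spec sheet, in USD per 1M tokens.
INPUT_PER_M = 1.50
CACHED_INPUT_PER_M = 0.15
OUTPUT_PER_M = 9.00
BATCH_DISCOUNT = 0.50


def input_cost(input_tokens, cached_fraction):
    """USD input cost when `cached_fraction` of the tokens are cache reads.

    Assumption: no separate cache storage or write fee is modeled.
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    return (fresh * INPUT_PER_M + cached * CACHED_INPUT_PER_M) / 1_000_000


def batch_cost(input_tokens, output_tokens):
    """USD cost of one call through the batch API.

    Assumption: the discount applies to input and output alike.
    """
    full = (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000
    return full * (1 - BATCH_DISCOUNT)


if __name__ == "__main__":
    uncached = input_cost(2_000, 0.0)
    mostly_cached = input_cost(2_000, 0.75)  # e.g. a 1.5k-token system prompt
    print(f"input cost per call: ${uncached:.6f} uncached, ${mostly_cached:.6f} at 75% cached")
    print(f"batch cost per call: ${batch_cost(2_000, 600):.4f}")
```

With 75% of a 2k-token prompt cached, the input cost per call drops from $0.003 to $0.000975, a 67.5% saving; the full 90% is reached only when the entire prompt is a cache read.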

What is the context window of Gemini 3.5 Flash?

Gemini 3.5 Flash has a 1.0M-token context window with up to 66k tokens of output. That's enough for entire codebases, long transcripts, or multi-document RAG.
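
A simple pre-flight check can catch requests that would not fit. The sketch below uses the rounded limits quoted above (1.0M and 66k tokens) and makes a conservative assumption that prompt and output share the same window; the exact limits and accounting rules may differ, and the token counts are expected to come from whatever tokenizer or counting endpoint you use.

```python
# Rounded limits from the spec sheet; exact values may differ slightly.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 66_000


def fits_context(prompt_tokens, requested_output_tokens):
    """Return True if the request stays within both limits.

    Assumption (conservative): prompt and output tokens together must
    fit inside the context window.
    """
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(fits_context(900_000, 60_000))  # within both limits
    print(fits_context(990_000, 60_000))  # prompt plus output exceeds the window
    print(fits_context(10_000, 70_000))   # output request exceeds the cap
```

Leaving headroom below the limits is a reasonable design choice, since token counts estimated client-side rarely match the server's count exactly.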

Does Gemini 3.5 Flash have a free tier?

Yes. Google AI Studio offers a reduced daily quota for prototyping. Start at https://aistudio.google.com.

Is Gemini 3.5 Flash HIPAA / EU / on-prem friendly?

Gemini 3.5 Flash is HIPAA-eligible and available in US, EU, and APAC regions. It is API-only: the weights are not open and there is no on-prem deployment. Zero data retention is available for enterprise customers.

What is Gemini 3.5 Flash best for?

Gemini 3.5 Flash is best for agents, coding, long-context work, and multimodal tasks. Trade-offs to be aware of: it is 3x more expensive than Gemini 3 Flash, and non-global regions are priced 10% higher, at $1.65 input / $9.90 output per 1M tokens.

Which tools and SDKs work with Gemini 3.5 Flash?

Gemini 3.5 Flash integrates with Google AI SDK, Vertex AI, GitHub Copilot, Vercel AI SDK, LangChain, LlamaIndex, and OpenRouter. Most major AI frameworks support it either natively or through OpenAI-compatible endpoints.