aitoolsatlas.ai
© 2026 aitoolsatlas.ai. All rights reserved.

Find the right AI tool in 2 minutes. Independent reviews and honest comparisons of 875+ AI tools.

Language Model

Grok 4.20 0309 v2

A high-performance reasoning language model from xAI, listed on Artificial Analysis, that supports text and image input with a 2M token context window. Notable for fast inference speed and strong intelligence ranking among comparable models.

Starting at $3.00 per million tokens
Visit Grok 4.20 0309 v2 →

Overview

Grok 4.20 0309 v2, as listed on Artificial Analysis, is a reasoning language model from xAI that delivers high-intelligence text and image understanding with a 2M token context window, priced per token through xAI's first-party API. It targets developers, AI engineers, and enterprises building reasoning-heavy applications such as code generation, scientific analysis, and long-document comprehension.

On Artificial Analysis, Grok 4.20 0309 v2 is benchmarked alongside hundreds of tracked models, where it ranks among xAI's top-tier reasoning offerings competing with systems from OpenAI, Anthropic, Google, DeepSeek, and Alibaba. The model is evaluated on the Artificial Analysis Intelligence Index v4.0, which aggregates 10 demanding benchmarks including GDPval-AA, τ²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, and CritPt. Its key differentiators are a 2M-token context window — substantially larger than the context windows offered by most competing flagship reasoning models — and fast output speed measured in tokens per second after the first streaming chunk is received.

Compared to other reasoning language models in our directory, Grok 4.20 0309 v2 emphasizes the price-quality frontier: Artificial Analysis charts it on the "most attractive quadrant" of the intelligence-vs-price log scale, where price is computed as a 3:1 blend of input and output token costs. At approximately $3.00 per million input tokens and $15.00 per million output tokens, with a discounted cached input rate of $0.75 per million tokens, it is competitively positioned for high-volume reasoning workloads. The model supports both text and image inputs, making it suitable for multimodal reasoning tasks such as chart interpretation, diagram analysis, and visual question answering. For teams already using xAI's ecosystem or building on top of X (formerly Twitter) data integrations, Grok 4.20 0309 v2 offers a tightly integrated alternative to OpenAI's o-series and Anthropic's Claude reasoning lines, with the trade-off being a smaller third-party provider ecosystem and less mature tooling.
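The 3:1 blended price that Artificial Analysis charts is easy to reproduce. The sketch below uses the approximate rates quoted on this page ($3.00/M input, $15.00/M output), not live pricing:

```python
# Blended price as defined by Artificial Analysis: a 3:1 weighting of
# input and output per-million-token rates. The rates passed in below
# are the approximate figures quoted in this review, not live pricing.

def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Return the 3:1 input:output blended price per million tokens."""
    return (3 * input_per_m + 1 * output_per_m) / 4

print(blended_price(3.00, 15.00))  # 6.0
```

At these rates the blend works out to $6.00 per million tokens, which is roughly the figure the intelligence-vs-price chart would plot.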

🎨 Vibe Coding Friendly?

Difficulty: intermediate

Suitability for vibe coding depends on your experience level and the specific use case.

Learn about Vibe Coding →

Key Features

2M Token Context Window

Accepts up to 2 million tokens per request, substantially larger than the 128K–200K context windows typical of competing flagship reasoning models. This enables full-repository code analysis, multi-document synthesis, and very long agent histories without chunking or external retrieval pipelines.

Multimodal Text + Image Input

Processes text and image inputs natively in the same request, with image input priced at approximately $5.25/M tokens. Useful for chart interpretation, screenshot debugging, and document understanding workflows that mix prose and visuals.

Reasoning-Optimized Architecture

Marked with the lightbulb (reasoning) indicator on Artificial Analysis, meaning the model performs internal chain-of-thought before producing user-visible output. This boosts performance on benchmarks like GPQA Diamond and Humanity's Last Exam at the cost of higher first-token latency.

Cached Input Pricing Tier

Offers a discounted rate of approximately $0.75/M tokens for cached input, compared to $3.00/M for standard input — a 75% reduction that materially lowers costs when sending the same long system prompt or document repeatedly across many queries.
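To see why the cached tier matters for repeated long prompts, here is a rough cost comparison under the rates quoted above. It assumes the first call is uncached and every later call hits the cache, which is optimistic; real hit rates depend on the provider's cache behavior.

```python
# Rough comparison of resending the same long system prompt with and
# without the cached-input tier, at the approximate rates quoted on
# this page. Assumes the first call is uncached and all later calls
# hit the cache (optimistic; real hit rates vary by provider).

INPUT_RATE = 3.00    # $ per million standard input tokens
CACHED_RATE = 0.75   # $ per million cached input tokens

def repeated_prompt_cost(prompt_tokens: int, calls: int, use_cache: bool) -> float:
    millions = prompt_tokens / 1_000_000
    first = millions * INPUT_RATE  # first call always pays the standard rate
    rest_rate = CACHED_RATE if use_cache else INPUT_RATE
    return first + millions * rest_rate * (calls - 1)

# A 100K-token system prompt reused across 100 calls:
print(f"uncached: ${repeated_prompt_cost(100_000, 100, use_cache=False):.2f}")
print(f"cached:   ${repeated_prompt_cost(100_000, 100, use_cache=True):.2f}")
```

Under these assumptions the cached tier cuts the 100-call bill from about $30 to under $8.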

Transparent Benchmark Reporting via Artificial Analysis

Continuously evaluated on Intelligence Index v4.0, which aggregates 10 benchmarks (GDPval-AA, τ²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, CritPt). This third-party evaluation provides ongoing, public quality and speed measurements rather than vendor-only claims.

Pricing Plans

Input Tokens

$3.00 per million tokens

  • ✓ Standard text input processing
  • ✓ Up to 2M token context window

Cached Input Tokens

$0.75 per million tokens

  • ✓ 75% discount vs standard input
  • ✓ Ideal for repeated system prompts and long documents

Output Tokens

$15.00 per million tokens

  • ✓ Streaming text output
  • ✓ Includes reasoning chain-of-thought generation

Image Input Tokens

$5.25 per million tokens

  • ✓ Native image understanding
  • ✓ Charts, diagrams, screenshots, and documents
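Taken together, the four rates above support a quick back-of-the-envelope estimate for a single request. The token counts in the example are invented for illustration; check x.ai for current rates:

```python
# Back-of-the-envelope cost of a single request, combining the four
# per-million-token rates listed above. Token counts in the example
# are invented for illustration; check x.ai for current rates.

RATES = {                 # $ per million tokens
    "input": 3.00,
    "cached_input": 0.75,
    "output": 15.00,
    "image_input": 5.25,
}

def request_cost(**tokens: int) -> float:
    """Dollar cost for token counts keyed by the rate names in RATES."""
    return sum(count / 1_000_000 * RATES[kind] for kind, count in tokens.items())

# 50K fresh input, 200K cached context, ~1K image tokens, 4K output:
cost = request_cost(input=50_000, cached_input=200_000,
                    image_input=1_000, output=4_000)
print(f"${cost:.4f}")
```

Note how the 200K cached tokens cost the same as the 50K fresh ones; for long-context workloads the cached tier, not the headline input rate, tends to dominate the math.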
See Full Pricing → · Free vs Paid → · Is it worth it? →

Ready to get started with Grok 4.20 0309 v2?

View Pricing Options →

Best Use Cases

🎯

Whole-codebase analysis and refactoring where the full repository (up to 2M tokens) needs to fit in a single prompt without retrieval

⚡

Long-document review for legal contracts, financial filings, or research papers requiring cross-section reasoning

🔧

Multimodal scientific reasoning combining diagrams, charts, and prose in a single request — for example interpreting experimental figures alongside methodology text

🚀

Latency-sensitive agentic applications where fast streaming output keeps interactive UIs responsive during chain-of-thought

💡

Cost-optimized batch reasoning workloads using the cached input pricing tier ($0.75/M tokens) for prompts with large repeated system contexts

🔄

Benchmark-driven model selection for teams who want transparent third-party evaluation via Artificial Analysis Intelligence Index v4.0

Limitations & What It Can't Do

We believe in transparent reviews. Here's what Grok 4.20 0309 v2 doesn't handle well:

  • ⚠ No image generation — multimodality is input-only, output is text
  • ⚠ Per-token billing only; no flat subscription means costs scale with usage and can be hard to predict for high-volume apps
  • ⚠ Reasoning models incur higher time-to-first-token latency than non-reasoning equivalents
  • ⚠ Smaller ecosystem of third-party hosting providers compared to OpenAI/Anthropic models, reducing redundancy options
  • ⚠ Performance figures shown on Artificial Analysis represent first-party API or median across providers; actual results may vary by route and region

Pros & Cons

✓ Pros

  • ✓ 2M token context window is substantially larger than most competing reasoning models, enabling whole-codebase or whole-book analysis
  • ✓ Multimodal support accepts both text and image inputs in a single request
  • ✓ Positioned in the 'most attractive quadrant' of price-vs-intelligence on the Artificial Analysis chart, indicating strong value relative to peers
  • ✓ Fast output speed measured in tokens-per-second sustained after first chunk, suitable for latency-sensitive streaming UIs
  • ✓ Evaluated against 10 rigorous benchmarks including Humanity's Last Exam, GPQA Diamond, and SciCode for transparent quality reporting
  • ✓ Cached input pricing at ~$0.75/M tokens reduces costs for repeated long-context prompts by roughly 75% versus standard input rates

✗ Cons

  • ✗ Pricing is per-token only — no flat-rate or subscription tier for individual users
  • ✗ Smaller third-party provider ecosystem compared to OpenAI or Anthropic, limiting failover and routing options
  • ✗ As a reasoning model, latency to first token can be higher than non-reasoning peers due to internal chain-of-thought
  • ✗ Documentation and SDK maturity lag behind GPT and Claude, requiring more integration work
  • ✗ Output speed and price metrics rely on first-party API median; real-world variance across providers can be significant

Frequently Asked Questions

How does Grok 4.20 0309 v2's 2M token context window compare to other reasoning models?

The 2M token context is substantially larger than the context windows offered by most competing flagship reasoning models, which typically range from 128K to 200K tokens. This allows you to feed entire codebases, multi-volume documents, or extended conversation histories without chunking or retrieval-augmented workarounds. For long-context tasks like legal document review or full-repo refactoring, this is a meaningful advantage. However, retrieval quality at the upper end of any large context window varies, so empirical testing on your specific use case is recommended before committing.
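A quick way to sanity-check whether a corpus fits is the common rough heuristic of about 4 characters per token for English text and code. The model's actual tokenizer will differ, so treat this only as a first pass:

```python
# Rough fit check for the 2M-token context window, using the common
# ~4 characters-per-token heuristic for English text and code. The
# real token count depends on the model's tokenizer.

CONTEXT_WINDOW = 2_000_000
CHARS_PER_TOKEN = 4  # heuristic, not the actual tokenizer ratio

def fits_in_context(total_chars: int, reserve_output: int = 16_000) -> bool:
    """True if an input of total_chars likely fits, leaving room for output."""
    estimated_tokens = total_chars / CHARS_PER_TOKEN
    return estimated_tokens + reserve_output <= CONTEXT_WINDOW

print(fits_in_context(6_000_000))  # a ~6 MB repo: True (~1.5M estimated tokens)
print(fits_in_context(9_000_000))  # a ~9 MB repo: False, would need chunking
```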

How is Grok 4.20 0309 v2 priced?

Pricing is per-million-tokens: approximately $3.00/M for input tokens, $15.00/M for output tokens, $0.75/M for cached input tokens, and $5.25/M for image input tokens. The Artificial Analysis 'Price' metric blends input and output at a 3:1 ratio for fair cross-model comparison. There is no free consumer tier listed for direct API access; usage is metered and billed against an xAI account. For the latest rates, check xAI's API pricing page at x.ai or the live pricing comparison on Artificial Analysis, as per-token pricing updates periodically.

What benchmarks is Grok 4.20 0309 v2 evaluated on?

Artificial Analysis tracks it on the Intelligence Index v4.0, which aggregates 10 evaluations: GDPval-AA, τ²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, and CritPt. These cover scientific reasoning, code execution, long-context retrieval, instruction following, and graduate-level domain knowledge. The composite index is designed to resist gaming by any single benchmark and provides a holistic view of model capability. Individual benchmark scores are also published for fine-grained comparison.

Can Grok 4.20 0309 v2 handle image inputs?

Yes — it supports both text and image inputs natively, making it a multimodal reasoning model rather than text-only. This enables use cases like chart interpretation, screenshot debugging, document OCR with reasoning, and visual question answering in a single API call. Image input is priced at approximately $5.25 per million tokens, separate from text token rates. Output is text-only; the model does not generate images.

How does output speed compare to other reasoning models?

Artificial Analysis measures output speed as tokens-per-second sustained after the first streaming chunk arrives, and tracks both median speed and variance over time. Grok 4.20 0309 v2 is highlighted for fast inference among comparable reasoning models, though absolute numbers vary by provider and load. Reasoning models typically have higher time-to-first-token than non-reasoning peers because they generate internal chain-of-thought before user-visible output. Check the Output Speed and Output Speed Over Time charts on Artificial Analysis for current measurements.


What's New in 2026

The v2 revision (0309) released in March 2026 brought updated benchmark evaluations on the Artificial Analysis Intelligence Index v4.0, continued competitive positioning on the price-vs-intelligence frontier, and ongoing output speed tracking. xAI's API pricing for this model variant reflects current market rates as of April 2026, with cached input pricing offering a notable cost advantage for high-volume long-context workloads.

User Reviews

No reviews yet. Be the first to share your experience!

Quick Info

Category

Language Model

Website

artificialanalysis.ai/models/grok-4-20
🔄 Compare with alternatives →

Try Grok 4.20 0309 v2 Today

Get started with Grok 4.20 0309 v2 and see if it's the right fit for your needs.

Get Started →


More about Grok 4.20 0309 v2

Pricing · Review · Alternatives · Free vs Paid · Pros & Cons · Worth It? · Tutorial