Coding Agents

Claude Sonnet 4

Name: Claude Sonnet 4
Brand: Claude Sonnet 4
Availability: InStock

An advanced AI language model that delivers superior coding and reasoning capabilities with more precise instruction following. Offers both near-instant responses and extended thinking modes for deeper reasoning tasks.

Starting at$0

Visit Claude Sonnet 4 →

💡

In Plain English

Overview

Claude Sonnet 4 is a Language Model from Anthropic that delivers state-of-the-art coding and reasoning capabilities with hybrid instant and extended thinking modes, with pricing starting at $3 per million input tokens and $15 per million output tokens. It is built for developers, engineering teams, and enterprises that need a balanced, high-throughput model for production workloads.

Released in May 2025 as part of the Claude 4 family alongside Claude Opus 4, Sonnet 4 represents a significant upgrade over Claude Sonnet 3.7, scoring 72.7% on SWE-bench Verified — one of the highest publicly reported scores for a frontier coding model at launch. The model introduces hybrid reasoning, meaning users can toggle between near-instant responses for routine queries and an extended thinking mode that lets the model deliberate for longer on complex problems. It also gains the ability to use tools (such as web search) during extended thinking, alternating between reasoning steps and tool calls to improve answers. Anthropic has tightened instruction-following behavior, reduced reward-hacking shortcuts by 65% versus Sonnet 3.7 on agentic coding tasks, and added parallel tool use plus improved memory when developers grant file access.

In practice, Claude Sonnet 4 powers GitHub Copilot's new coding agent, drives agentic workflows in Cursor, Windsurf, and Replit, and is available through the Claude API, Amazon Bedrock, and Google Cloud Vertex AI. Compared to GPT-4.1 and Gemini 2.5 Pro, Sonnet 4 is positioned as the most coding-capable mid-tier model — cheaper than Opus 4 (which runs $15/$75 per million tokens) but markedly stronger than competing mid-tier models on agentic and software-engineering benchmarks. Based on our analysis of 870+ AI tools, Sonnet 4 sits in the top tier of language models for production coding agents, IDE integrations, and long-horizon agentic tasks where reliability and instruction adherence matter more than raw chat capability.

🎨

Vibe Coding Friendly?

▼

Difficulty:intermediate

Suitability for vibe coding depends on your experience level and the specific use case.

Learn about Vibe Coding →

Was this helpful?

Key Features

Hybrid reasoning with extended thinking+

Sonnet 4 can respond near-instantly for routine queries or switch into an extended thinking mode that allocates more compute for deliberation. Developers control this via a single API parameter, making it possible to escalate hard requests within a single agent loop without swapping models.

Tool use during reasoning+

Unlike traditional models that finish reasoning before invoking tools, Sonnet 4 can interleave tool calls (such as web search or code execution) inside its extended thinking process. This produces more grounded answers on research-style questions and reduces hallucinations on factual claims.

Agentic coding optimization+

Anthropic specifically tuned Sonnet 4 for long-running coding agents, achieving 72.7% on SWE-bench Verified and reducing reward-hacking shortcut behavior by 65% versus Sonnet 3.7. This is why GitHub selected it as the engine for Copilot's new coding agent and why it leads adoption inside Cursor, Windsurf, and Replit.

Parallel tool use and persistent memory+

When granted file system access, Sonnet 4 can write notes to disk to maintain context across long sessions and call multiple tools in parallel rather than sequentially. This dramatically improves throughput and consistency on multi-hour agentic tasks like full-project refactors or research syntheses.

Multi-cloud availability and pricing parity+

Sonnet 4 is available through the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI at the same $3/$15 per million token pricing. Combined with prompt caching (up to 90% off cached inputs) and batch processing (50% off async workloads), this gives enterprises flexibility on procurement, compliance, and cost optimization.

Pricing Plans

Free (Claude.ai)

✓Access to Claude Sonnet 4 with daily message limits
✓Web, iOS, and Android apps
✓File and image uploads
✓Basic conversation history

Pro

$20/month

✓5x more usage than Free
✓Access to Claude Opus 4 and extended thinking
✓Projects for organizing chats and files
✓Priority access during peak times
✓Early access to new features

Team

$25/user/month (annual) or $30/user/month billed monthly, min 5 seats

✓Everything in Pro with more usage
✓Central billing and admin controls
✓Collaboration features for teams
✓Minimum 5 seats

API

$3 / $15 per million input/output tokens

✓Pay-as-you-go API access
✓Up to 90% savings with prompt caching
✓50% discount with Batch API
✓Available on Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI
✓200K token context window

Enterprise

Custom

✓Expanded context and usage limits
✓SSO, SCIM, and audit logs
✓Data residency and compliance options
✓Dedicated support and onboarding

See Full Pricing →Free vs Paid →Is it worth it? →

Ready to get started with Claude Sonnet 4?

View Pricing Options →

Best Use Cases

🎯

Powering autonomous coding agents inside IDEs like Cursor, Windsurf, and GitHub Copilot, where reliable multi-step instruction following is critical

⚡

Building customer-facing chat products where you need a balance of low-latency responses and an optional 'deep think' escalation path

🔧

Long-horizon agentic workflows that span hours and require parallel tool use, memory across files, and consistent goal adherence

🚀

High-volume production workloads where Opus 4's $75/M output token cost is prohibitive but you still want frontier-grade coding ability

💡

Replacing Claude Sonnet 3.7 in existing pipelines as a drop-in upgrade at the same $3/$15 pricing with materially better benchmarks

🔄

Enterprise deployments through Amazon Bedrock or Google Cloud Vertex AI that require regional hosting, IAM integration, or BAA-style data agreements

Limitations & What It Can't Do

We believe in transparent reviews. Here's what Claude Sonnet 4 doesn't handle well:

⚠Knowledge cutoff means it lacks awareness of events after its training date unless paired with web search tools
⚠Output token costs scale quickly with extended thinking mode — a long deliberation can 5–10x the per-request spend
⚠200K context window is generous but smaller than Gemini 2.5 Pro's 1M+, which matters for whole-monorepo analysis
⚠Not a fit for the very hardest agentic research tasks where Opus 4's additional capability measurably improves outcomes
⚠Free Claude.ai usage is rate-limited and cannot be relied on for sustained development work without a paid plan or API access

Pros & Cons

✓ Pros

✓Scores 72.7% on SWE-bench Verified, leading mid-tier coding benchmarks at launch
✓Hybrid reasoning lets you trade latency for depth on a per-request basis without switching models
✓Reduces shortcut/reward-hacking behavior by 65% compared to Claude Sonnet 3.7 on agentic coding tasks
✓Available through Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI with consistent pricing of $3/$15 per million input/output tokens
✓Free tier access through Claude.ai and integrations into GitHub Copilot, Cursor, Windsurf, and Replit
✓Parallel tool use and improved memory make it well-suited for long-horizon agents that span hours of work

✗ Cons

✗Falls short of Claude Opus 4 on the hardest reasoning and research-grade coding tasks
✗Output pricing of $15 per million tokens is higher than open-weight alternatives like DeepSeek or Llama-based hosts
✗Extended thinking mode can substantially increase latency and token costs if not carefully gated
✗200K context window is smaller than Gemini 2.5 Pro's 1M+ token context for very large codebases
✗Free Claude.ai usage has rate limits that make heavy iterative coding impractical without an API key or paid plan

Frequently Asked Questions

How much does Claude Sonnet 4 cost to use via the API?+

Claude Sonnet 4 is priced at $3 per million input tokens and $15 per million output tokens through the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI. Prompt caching can reduce input costs by up to 90% and batch processing offers a 50% discount for non-real-time workloads. This pricing is unchanged from Claude Sonnet 3.7, so the upgrade comes at no additional cost. For casual use, Claude.ai offers a free tier, with Pro ($20/month) and Team ($25/user/month billed annually, or $30/user/month billed monthly) plans for higher limits.

What's the difference between Claude Sonnet 4 and Claude Opus 4?+

Claude Opus 4 is Anthropic's flagship model designed for the most complex, long-running agentic tasks. It costs $15/$75 per million input/output tokens — five times more expensive than Sonnet 4 — and is built for problems where additional compute and capability materially improve outcomes. Sonnet 4 is the workhorse model: 72.7% SWE-bench Verified, identical hybrid reasoning capabilities, but optimized for high-volume production use at $3/$15 per million tokens. Most teams deploy Sonnet 4 for everyday coding agents and reserve Opus 4 for hard problems or research workflows where the extra capability justifies the cost premium.

How does Claude Sonnet 4 compare to GPT-4.1 and Gemini 2.5 Pro for coding?+

On SWE-bench Verified, Claude Sonnet 4 scores 72.7%, which is competitive with or ahead of GPT-4.1 and Gemini 2.5 Pro on most agentic coding benchmarks. Sonnet 4's strength is instruction-following and reduced reward-hacking on long-running coding tasks, which is why GitHub chose it to power Copilot's new coding agent. Gemini 2.5 Pro retains an advantage on extremely large contexts (1M+ tokens) and GPT-4.1 has stronger general-purpose chat polish, but for autonomous coding work Sonnet 4 is currently the most reliable mid-tier option. Based on our analysis of 870+ AI tools, it's the most-recommended model for IDE-integrated agents.

What is extended thinking mode and when should I use it?+

Extended thinking is a hybrid reasoning feature that lets Claude Sonnet 4 deliberate for longer before responding, optionally using tools like web search between reasoning steps. You enable it via an API parameter or toggle in Claude.ai. Use it for hard problems — multi-step debugging, math-heavy reasoning, complex refactors, or research tasks — where a few extra seconds of latency and additional token spend are worth a substantially better answer. For routine code completion or quick Q&A, the default near-instant mode is faster and cheaper.

Can Claude Sonnet 4 be used in production agents and IDEs?+

Yes — Claude Sonnet 4 powers GitHub Copilot's coding agent, Cursor's agent mode, Windsurf, Replit Agent, and dozens of other production developer tools. Anthropic has specifically tuned the model for long-horizon agentic workflows, with parallel tool use, improved memory when given file system access, and a 65% reduction in shortcut-taking behavior versus Sonnet 3.7. It is available via the Anthropic API as well as Amazon Bedrock and Google Cloud Vertex AI for enterprises with cloud-vendor preferences or compliance requirements.

🦞

New to AI tools?

Read practical guides for choosing and using AI tools

Read Guides →

Get updates on Claude Sonnet 4 and 370+ other AI tools

Weekly insights on the latest AI tools, features, and trends delivered to your inbox.

What's New in 2026

Claude Sonnet 4 launched in May 2025 as part of the Claude 4 family alongside Claude Opus 4, introducing hybrid extended thinking, tool use during reasoning, parallel tool calls, and a 65% reduction in reward-hacking shortcuts versus Sonnet 3.7. It scores 72.7% on SWE-bench Verified and powers GitHub Copilot's coding agent, Cursor, Windsurf, and Replit. As of April 2026, it remains widely deployed in production via the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI.

Alternatives to Claude Sonnet 4

Claude Opus 4.7

AI Agent Builders

Claude Opus 4.7 is a hybrid reasoning model for coding agents, enterprise AI workflows, long-context analysis, and complex multi-step tasks.

View All Alternatives & Detailed Comparison →

User Reviews

No reviews yet. Be the first to share your experience!

Quick Info

Try Claude Sonnet 4 Today

Get started with Claude Sonnet 4 and see if it's the right fit for your needs.

Get Started →

Need help choosing the right AI stack?

Take our 60-second quiz to get personalized tool recommendations

Find Your Perfect AI Stack →

Want a faster launch?

Explore 20 ready-to-deploy AI agent templates for sales, support, dev, research, and operations.

Browse Agent Templates →

More about Claude Sonnet 4

Pricing Review Alternatives Free vs Paid Pros & Cons Worth It?Tutorial

📚 Related Articles

AI Coding Agents Compared: Claude Code vs Cursor vs Copilot vs Codex (2026)

Compare the top AI coding agents in 2026 — Claude Code, Cursor, Copilot, Codex, Windsurf, Aider, and more. Real pricing, honest strengths, and a decision framework for every skill level.

2026-03-1612 min read

Overview

Key Features

Hybrid reasoning with extended thinking+

Tool use during reasoning+

Agentic coding optimization+

Parallel tool use and persistent memory+

Multi-cloud availability and pricing parity+

Pricing Plans

Free (Claude.ai)

✓Access to Claude Sonnet 4 with daily message limits
✓Web, iOS, and Android apps
✓File and image uploads
✓Basic conversation history

Pro

$20/month

✓5x more usage than Free
✓Access to Claude Opus 4 and extended thinking
✓Projects for organizing chats and files
✓Priority access during peak times
✓Early access to new features

Team

$25/user/month (annual) or $30/user/month billed monthly, min 5 seats

✓Everything in Pro with more usage
✓Central billing and admin controls
✓Collaboration features for teams
✓Minimum 5 seats

API

$3 / $15 per million input/output tokens

✓Pay-as-you-go API access
✓Up to 90% savings with prompt caching
✓50% discount with Batch API
✓Available on Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI
✓200K token context window

Enterprise

Custom

✓Expanded context and usage limits
✓SSO, SCIM, and audit logs
✓Data residency and compliance options
✓Dedicated support and onboarding

Ready to get started with Claude Sonnet 4?

View Pricing Options →

Best Use Cases

🎯

Powering autonomous coding agents inside IDEs like Cursor, Windsurf, and GitHub Copilot, where reliable multi-step instruction following is critical

⚡

Building customer-facing chat products where you need a balance of low-latency responses and an optional 'deep think' escalation path

🔧

Long-horizon agentic workflows that span hours and require parallel tool use, memory across files, and consistent goal adherence

🚀

High-volume production workloads where Opus 4's $75/M output token cost is prohibitive but you still want frontier-grade coding ability

💡

Replacing Claude Sonnet 3.7 in existing pipelines as a drop-in upgrade at the same $3/$15 pricing with materially better benchmarks

🔄

Enterprise deployments through Amazon Bedrock or Google Cloud Vertex AI that require regional hosting, IAM integration, or BAA-style data agreements

Limitations & What It Can't Do

We believe in transparent reviews. Here's what Claude Sonnet 4 doesn't handle well:

⚠Knowledge cutoff means it lacks awareness of events after its training date unless paired with web search tools

⚠Output token costs scale quickly with extended thinking mode — a long deliberation can 5–10x the per-request spend

⚠200K context window is generous but smaller than Gemini 2.5 Pro's 1M+, which matters for whole-monorepo analysis

⚠Not a fit for the very hardest agentic research tasks where Opus 4's additional capability measurably improves outcomes

⚠Free Claude.ai usage is rate-limited and cannot be relied on for sustained development work without a paid plan or API access

Pros & Cons

✓ Pros

✓Scores 72.7% on SWE-bench Verified, leading mid-tier coding benchmarks at launch
✓Hybrid reasoning lets you trade latency for depth on a per-request basis without switching models
✓Reduces shortcut/reward-hacking behavior by 65% compared to Claude Sonnet 3.7 on agentic coding tasks
✓Available through Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI with consistent pricing of $3/$15 per million input/output tokens
✓Free tier access through Claude.ai and integrations into GitHub Copilot, Cursor, Windsurf, and Replit
✓Parallel tool use and improved memory make it well-suited for long-horizon agents that span hours of work

✗ Cons

✗Falls short of Claude Opus 4 on the hardest reasoning and research-grade coding tasks
✗Output pricing of $15 per million tokens is higher than open-weight alternatives like DeepSeek or Llama-based hosts
✗Extended thinking mode can substantially increase latency and token costs if not carefully gated
✗200K context window is smaller than Gemini 2.5 Pro's 1M+ token context for very large codebases
✗Free Claude.ai usage has rate limits that make heavy iterative coding impractical without an API key or paid plan