Anthropic launches Claude Sonnet 4.6 with automatic prompt caching and free code execution
Claude · Anthropic · release, feature, API, deprecation, model, pricing, performance · platform.claude.com ↗

Claude Sonnet 4.6 Launch

Anthropic announced Claude Sonnet 4.6 as its latest balanced model, designed to deliver improved agentic search performance while consuming fewer tokens than previous versions. Sonnet 4.6 supports extended thinking capabilities and a 1M token context window (currently in beta), matching the context capabilities of the flagship Opus model while maintaining faster inference speeds.

Automatic Prompt Caching

The Messages API now features automatic caching, eliminating the need for manual breakpoint management. Developers can add a single cache_control field to their request body, and the system automatically caches the last cacheable block, moving the cache point forward as conversations grow. This works alongside existing block-level cache control for fine-grained optimization and is available on the Claude API and Azure AI Foundry (preview).
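The flow described above can be sketched as a request payload. This is a minimal illustration, not the live API: the model ID and the `cache_control` value are assumptions, since the announcement names only the field itself.

```python
def build_request(system_prompt: str, messages: list) -> dict:
    """Build a Messages API payload with automatic caching enabled.

    With a single top-level cache_control field, the system caches the
    last cacheable block and moves the cache point forward as the
    conversation grows, so no per-block breakpoints are needed.
    """
    return {
        "model": "claude-sonnet-4-6",        # model ID string is an assumption
        "max_tokens": 1024,
        "cache_control": {"type": "auto"},   # hypothetical value; only the field name is documented here
        "system": system_prompt,
        "messages": messages,
    }

payload = build_request(
    "You are a helpful assistant.",
    [{"role": "user", "content": "Hello"}],
)
```

Block-level `cache_control` entries can still be mixed in for fine-grained optimization, per the announcement.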

Pricing & Tool Updates

Key pricing and feature changes include:

  • Free code execution when used with web search or web fetch tools
  • Web search tool and programmatic tool calling now generally available (exiting beta)
  • Dynamic filtering for web search and fetch, using code execution to filter results before they reach the context window
  • Fine-grained tool streaming now generally available across all models and platforms
  • Multiple tools graduating from beta: code execution tool, web fetch tool, tool search tool, tool use examples, and memory tool
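As a sketch of the free-code-execution combination above, the payload below pairs web search with code execution so results can be filtered server-side before entering the context window. The tool `type` strings are assumptions for illustration; consult the API reference for the exact identifiers.

```python
def build_search_request(query: str) -> dict:
    """Payload pairing the web search tool with code execution.

    Per the release notes, code execution is free when used alongside
    web search or web fetch, and dynamic filtering runs code over the
    raw results before they reach the model's context window.
    """
    return {
        "model": "claude-sonnet-4-6",  # model ID string is an assumption
        "max_tokens": 2048,
        "tools": [
            {"type": "web_search"},     # tool type strings are
            {"type": "code_execution"}, # assumptions, not verbatim IDs
        ],
        "messages": [{"role": "user", "content": query}],
    }

request = build_search_request("latest TypeScript release notes")
```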

Model Deprecations

Anthropic retired Claude Sonnet 3.7 and Claude Haiku 3.5; all requests to those models now return errors. Developers should migrate to Claude Sonnet 4.6 and Claude Haiku 4.5, respectively. Anthropic also announced the deprecation of Claude Haiku 3, with retirement scheduled for April 19, 2026. Researchers can request ongoing access through the External Researcher Access Program.
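The migration path above can be captured in a small helper. The model ID strings here are hypothetical placeholders chosen for illustration, not verbatim API identifiers.

```python
# Hypothetical mapping from retired model IDs to their recommended
# replacements (Sonnet 3.7 -> Sonnet 4.6, Haiku 3.5 -> Haiku 4.5).
RETIRED_MODELS = {
    "claude-3-7-sonnet": "claude-sonnet-4-6",
    "claude-3-5-haiku": "claude-haiku-4-5",
}

def migrate_model(model_id: str) -> str:
    """Return the replacement for a retired model ID.

    IDs that are not retired pass through unchanged, so this can be
    applied unconditionally before building a request.
    """
    return RETIRED_MODELS.get(model_id, model_id)
```

Applying the helper at request-build time avoids hard errors once the old models begin rejecting traffic.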

Additional Capabilities

  • Opus 4.6 fast mode (research preview) delivers output generation up to 2.5x faster via the speed parameter
  • Compaction API (beta) provides server-side context summarization for effectively infinite conversations on Opus 4.6
  • Data residency controls allow developers to specify inference location via the inference_geo parameter, with US-only inference available at 1.1x pricing for models released after February 1, 2026
  • Structured outputs are now generally available on Claude Sonnet 4.5, Opus 4.5, and Haiku 4.5 with expanded schema support and simplified integration
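The data-residency pricing rule above lends itself to a small cost estimator: the 1.1x multiplier applies only when US-only inference is requested for a model released after February 1, 2026. The function and its parameters are illustrative assumptions, not part of any published SDK.

```python
from datetime import date

US_ONLY_MULTIPLIER = 1.1          # surcharge for US-only inference
CUTOFF = date(2026, 2, 1)         # applies to models released after this date

def token_cost(base_price_per_mtok: float, tokens: int,
               us_only: bool, model_release: date) -> float:
    """Estimate token cost under the US-only pricing rule.

    The 1.1x multiplier is charged only when both conditions hold:
    US-only inference is requested AND the model was released after
    February 1, 2026. Otherwise base pricing applies.
    """
    multiplier = US_ONLY_MULTIPLIER if (us_only and model_release > CUTOFF) else 1.0
    return base_price_per_mtok * (tokens / 1_000_000) * multiplier
```

For example, one million input tokens at a hypothetical $3 per million would cost $3.30 with US-only inference on a post-cutoff model, and $3.00 otherwise.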