Claude Sonnet 4.6 Launch
Anthropic announced Claude Sonnet 4.6 as its latest balanced model, designed to deliver improved agentic search performance while consuming fewer tokens than previous versions. Sonnet 4.6 supports extended thinking capabilities and a 1M token context window (currently in beta), matching the context capabilities of the flagship Opus model while maintaining faster inference speeds.
Automatic Prompt Caching
The Messages API now features automatic caching, eliminating the need for manual breakpoint management. Developers can add a single cache_control field to their request body, and the system automatically caches the last cacheable block, moving the cache point forward as conversations grow. This works alongside existing block-level cache control for fine-grained optimization and is available on the Claude API and Azure AI Foundry (preview).
Pricing & Tool Updates
Key pricing and feature changes include:
- Free code execution when used with web search or web fetch tools
- Web search tool and programmatic tool calling now generally available (exiting beta)
- Dynamic filtering for web search and fetch, using code execution to filter results before reaching the context window
- Fine-grained tool streaming now generally available across all models and platforms
- Multiple tools graduating from beta: code execution tool, web fetch tool, tool search tool, tool use examples, and memory tool
Model Deprecations
Anthropic retired Claude Sonnet 3.7 and Claude Haiku 3.5, with all requests returning errors. Developers should migrate to Claude Sonnet 4.6 and Claude Haiku 4.5 respectively. Claude Haiku 3 deprecation was also announced, with retirement scheduled for April 19, 2026. Researchers can request ongoing access through the External Researcher Access Program.
Additional Capabilities
- Opus 4.6 fast mode (research preview) delivers output generation up to 2.5x faster via the
speedparameter - Compaction API (beta) provides server-side context summarization for effectively infinite conversations on Opus 4.6
- Data residency controls allow developers to specify inference location via the
inference_geoparameter, with US-only inference available at 1.1x pricing for models released after February 1, 2026 - Structured outputs are now generally available on Claude Sonnet 4.5, Opus 4.5, and Haiku 4.5 with expanded schema support and simplified integration