Major GA Updates
1M Token Context Window Expands: Claude's extended context window is now generally available for Claude Opus 4.6 and Sonnet 4.6 at standard pricing. Requests exceeding 200k tokens work automatically without requiring a beta header. The window remains in beta for Sonnet 4.5 and Sonnet 4. Additionally, the media limit has been increased from 100 to 600 images or PDF pages per request when using the 1M token context.
Automatic Prompt Caching: The Messages API now supports automatic caching with a single cache_control field in your request body. The system intelligently caches the last cacheable block and moves the cache point forward as conversations grow—eliminating manual breakpoint management. This feature is available on the Claude API and Azure AI Foundry (preview) and works alongside existing block-level cache control for fine-grained optimization.
Pricing and Infrastructure Changes
- Unified rate limits: Dedicated 1M rate limits have been removed; standard account limits now apply across all context lengths
- Free code execution: API code execution is now free when used with web search or web fetch, improving token efficiency
- Data residency controls: New
inference_geoparameter allows US-only inference at 1.1x pricing for models released after February 1, 2026
Tool Availability and GA Promotions
The following tools and features have moved to general availability (no beta header required):
- Web search tool and programmatic tool calling
- Code execution tool, web fetch tool, and tool search tool
- Fine-grained tool streaming across all models and platforms
- Memory tool and tool use examples
Web search and web fetch now support dynamic filtering, which uses code execution to filter results before reaching the context window, reducing token consumption.
Model Updates and Deprecations
New Models: Claude Sonnet 4.6 combines improved speed and intelligence for everyday tasks with extended thinking support and agentic search improvements.
Retirements (effective now):
- Claude Sonnet 3.7 (
claude-3-7-sonnet-20250219) - Claude Haiku 3.5 (
claude-3-5-haiku-20241022)
Deprecation (retirement April 19, 2026):
- Claude Haiku 3 (
claude-3-haiku-20240307) — migrate to Claude Haiku 4.5
Researchers can request ongoing access through the External Researcher Access Program.
Additional Features
- Structured outputs are now GA on Claude Sonnet 4.5, Opus 4.5, and Haiku 4.5 with expanded schema support and improved grammar compilation
- Compaction API (beta) provides server-side context summarization for effectively infinite conversations on Opus 4.6
- Fast mode (research preview) on Opus 4.6 provides up to 2.5x faster output token generation at premium pricing