Claude Platform adds model capability discovery, 1M token context GA, and automatic prompt caching

Model Discovery and API Improvements

The Models API now includes model capability fields to help developers understand each model's capabilities at a glance. Requests to GET /v1/models and GET /v1/models/{model_id} now return max_input_tokens, max_tokens, and a capabilities object, enabling programmatic discovery without manual documentation lookups.

1M Token Context Window Now Generally Available

The 1M token context window is now available at standard pricing for Claude Opus 4.6 and Sonnet 4.6, with no beta header required. Requests over 200k tokens work automatically for these models. The feature remains in beta for Claude Sonnet 4.5 and Sonnet 4. Additionally, Anthropic removed dedicated 1M rate limits and unified all context lengths under standard account limits. The media limit per request was raised from 100 to 600 images or PDF pages when using the 1M token context window.

Automatic Prompt Caching and Extended Thinking

Automatic caching for the Messages API eliminates manual breakpoint management. Add a single cache_control field to your request body and the system automatically caches the last cacheable block, moving the cache point forward as conversations grow. Works alongside existing block-level cache control for fine-grained optimization.

Extended thinking now supports a display field, allowing developers to omit thinking content from responses for faster streaming. Set thinking.display: "omitted" to receive thinking blocks with an empty thinking field while preserving the signature for multi-turn continuity. Billing remains unchanged.

New Model Releases and Deprecations

Claude Opus 4.6 and Claude Sonnet 4.6 were launched as the latest models, with Opus 4.6 recommended for complex agentic tasks. Key deprecations include retirement of Claude Sonnet 3.7 and Claude Haiku 3.5, with Claude Haiku 3 scheduled for retirement on April 19, 2026. Developers should migrate to the latest model versions.

Agent Tools and Extended Features

Multiple agent tools are now generally available without beta headers: web search tool, programmatic tool calling, code execution tool, web fetch tool, tool search tool, tool use examples, and memory tool. Code execution is now free when used with web search or web fetch. Web search and web fetch support dynamic filtering, using code execution to filter results before they reach the context window.

Additional features launched include the compaction API (in beta) for server-side context summarization, data residency controls with the inference_geo parameter (1.1x pricing for US-only inference), adaptive thinking for Opus 4.6, and fine-grained tool streaming now GA across all models and platforms.

Model Discovery and API Improvements

1M Token Context Window Now Generally Available

Automatic Prompt Caching and Extended Thinking

New Model Releases and Deprecations

Agent Tools and Extended Features

Products

Tags

Published

Source

Related News