Models API Now Exposes Capability Metadata
The Claude Platform's Models API now exposes detailed capability fields. The GET /v1/models and GET /v1/models/{model_id} endpoints return max_input_tokens, max_tokens, and a capabilities object, so applications can discover and adapt to each model's limits at runtime instead of hard-coding them.
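As a sketch of how an application might use this metadata: the helper below clamps request parameters to a model's advertised limits. The response shape is illustrative, inferred from the field names above (max_input_tokens, max_tokens, capabilities); the model id and limit values are assumptions, not documented figures.

```python
def clamp_request(model_info: dict, prompt_tokens: int, desired_output: int) -> dict:
    """Build request limits that respect the model's advertised capabilities."""
    max_input = model_info["max_input_tokens"]
    max_output = model_info["max_tokens"]
    if prompt_tokens > max_input:
        raise ValueError(
            f"prompt ({prompt_tokens} tokens) exceeds model input limit ({max_input})"
        )
    # Never ask for more output tokens than the model supports.
    return {"model": model_info["id"], "max_tokens": min(desired_output, max_output)}

# Illustrative GET /v1/models/{model_id} response body; values are hypothetical.
info = {
    "id": "claude-opus-4-6",
    "max_input_tokens": 1_000_000,
    "max_tokens": 64_000,
    "capabilities": {"extended_thinking": True},
}
print(clamp_request(info, prompt_tokens=250_000, desired_output=128_000))
```

With metadata like this fetched once at startup, the same code can serve new models without a config change.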
1M Token Context Window Now Generally Available
The highly requested 1M token context window is now generally available for Claude Opus 4.6 and Sonnet 4.6 at standard pricing. Requests exceeding 200k tokens work automatically, with no beta header required. This removes the beta status that previously limited adoption and makes ultra-long context accessible to all developers.
Key Changes:
- Standard pricing for 1M context window on Opus 4.6 and Sonnet 4.6
- No beta header required for these models
- 1M token context remains in beta for Sonnet 4.5 and Sonnet 4
- Media limit increase: 600 images or PDF pages per request (up from 100)
- Dedicated 1M rate limits removed; standard account limits now apply across all context lengths
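The beta-header rules above can be sketched as a small helper that decides which extra headers a request needs. The model id strings and the `context-1m-2025-08-07` beta flag name are assumptions for illustration; check the API reference for the exact identifiers.

```python
# Models where 1M context is GA per the changes above (no beta header needed).
GA_1M_MODELS = {"claude-opus-4-6", "claude-sonnet-4-6"}
# Models where 1M context remains in beta.
BETA_1M_MODELS = {"claude-sonnet-4-5", "claude-sonnet-4-0"}


def extra_headers(model: str, prompt_tokens: int) -> dict:
    """Return the extra headers needed for a request of this size."""
    if prompt_tokens <= 200_000 or model in GA_1M_MODELS:
        return {}  # standard request path, no beta header
    if model in BETA_1M_MODELS:
        # Hypothetical beta flag name; verify against the docs.
        return {"anthropic-beta": "context-1m-2025-08-07"}
    raise ValueError(f"{model} does not support requests over 200k tokens")
```

For example, a 500k-token request to Opus 4.6 needs no extra headers, while the same request to Sonnet 4.5 still carries the beta flag.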
Extended Thinking Display Control
Developers can now control thinking content visibility in responses using the new display field for extended thinking. Setting thinking.display: "omitted" returns thinking blocks with empty content while preserving the signature field for multi-turn continuity. This enables faster streaming without affecting billing or context management.
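A minimal sketch of what this looks like on the wire, assuming a request body shaped like the description above; the exact field placement, model id, and the truncated signature value are illustrative, not documented values.

```python
# Request: enable extended thinking but omit its content from the response.
request = {
    "model": "claude-opus-4-6",  # illustrative model id
    "max_tokens": 1024,
    "thinking": {"type": "enabled", "budget_tokens": 4096, "display": "omitted"},
    "messages": [{"role": "user", "content": "Plan a migration strategy."}],
}

# Hypothetical response block: thinking content is empty, but the signature
# is preserved so the block can be passed back verbatim on the next turn.
thinking_block = {"type": "thinking", "thinking": "", "signature": "EqQBCg..."}


def next_turn_content(prior_blocks: list[dict]) -> list[dict]:
    """Keep thinking and text blocks (signatures intact) when continuing the conversation."""
    return [b for b in prior_blocks if b.get("type") in ("thinking", "text")]
```

The key point is that omitted thinking blocks must still be carried forward unchanged; dropping them would break multi-turn continuity even though their content is empty.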
Automatic Prompt Caching Launched
The Messages API now supports automatic caching with a single cache_control field. The system automatically caches the last cacheable block and moves the cache point forward as conversations grow, eliminating manual breakpoint management. This feature works alongside existing block-level cache control for fine-grained optimization and is available on Claude API and Azure AI Foundry (preview).
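A sketch of the request shape this implies: one top-level cache_control field, with no per-block breakpoints to manage as the conversation grows. The field placement and the "auto" value are assumptions based on the description above, and the model id is illustrative.

```python
def with_auto_cache(messages: list[dict]) -> dict:
    """Build a Messages API request using automatic caching (hypothetical shape)."""
    return {
        "model": "claude-opus-4-6",          # illustrative model id
        "max_tokens": 1024,
        "cache_control": {"type": "auto"},   # single field; system picks the cache point
        "messages": messages,
    }


# As turns accumulate, the client just appends messages; the system is
# described as advancing the cache point itself, so no breakpoint bookkeeping
# is needed here.
history = [{"role": "user", "content": "Summarize the design doc."}]
print(with_auto_cache(history))
```

Block-level cache_control on individual content blocks remains available when you want to pin a cache point explicitly, for example on a large, stable system prompt.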