Anthropic releases Claude Opus 4.6 with 1M token context, outperforms GPT-5.2 by 144 Elo points

Claude Opus 4.6 Now Available

Anthropic has released Claude Opus 4.6, a substantial upgrade to its flagship model with significant improvements in reasoning, coding capabilities, and agentic task execution. The new model is immediately available on claude.ai, via the Claude API (using claude-opus-4-6), and across major cloud platforms.

Key Capabilities and Improvements

Coding and Agentic Performance:

Achieves the highest score on Terminal-Bench 2.0, an evaluation of agentic coding and system tasks
Improved code review and debugging skills with better ability to catch its own mistakes
Can operate more reliably in larger codebases
Sustains agentic tasks for longer without degradation

Context Window:

Features a 1M token context window in beta — a first for Opus-class models, enabling work with significantly larger documents and codebases

Extended Thinking and Control:

Introduces adaptive thinking, where the model contextually determines how much to use extended reasoning
New effort controls give developers granular control over intelligence, speed, and cost tradeoffs
Developers can dial effort from "high" (default) to "medium" for simpler tasks to reduce latency and cost

Benchmark Performance

Opus 4.6 demonstrates state-of-the-art performance across multiple evaluations:

GDPval-AA: Outperforms OpenAI's GPT-5.2 by ~144 Elo points and its predecessor (Opus 4.5) by 190 points on economically valuable knowledge work tasks in finance, legal, and other domains
Humanity's Last Exam: Leads all frontier models on this complex multidisciplinary reasoning test
BrowseComp: Performs better than any other model on online information retrieval tasks
Safety Profile: Maintains competitive safety metrics with low rates of misaligned behavior across evaluations

New Product Features

Claude Code Enhancements:

Agent teams feature allows assembly of collaborative agents to work on tasks together

API Features:

Compaction enables the model to summarize its own context, allowing longer-running tasks without hitting token limits
New context management tools for building extended workflows

Office Suite Integration:

Substantial upgrades to Claude in Excel
Claude in PowerPoint now available in research preview for enhanced presentation capabilities

Pricing and Availability

Pricing remains unchanged at $5 per million input tokens and $25 per million output tokens across all access channels. Early Access partners including Notion, GitHub, Replit, Asana, and Cognition have reported the model excels at complex multi-step tasks and agentic workflows that previously required significant hand-holding.

Claude Opus 4.6 Now Available

Key Capabilities and Improvements

Benchmark Performance

New Product Features

Pricing and Availability

Products

Tags

Published

Source

Related News