Google upgrades Gemini 3 Deep Think; achieves 84.6% on ARC-AGI-2 and gold-medal math performance
Gemini · release, feature, model, api · deepmind.google

Gemini 3 Deep Think: Major Upgrade for Scientific Reasoning

Google has released an upgrade to Gemini 3 Deep Think, its specialized reasoning mode for complex problems in science, research, and engineering. The new version was developed in close partnership with scientists and researchers to handle problems that lack clear guardrails, a single correct solution, or complete data, all of which are common in real-world research.

Availability and Access

  • Gemini app: Now available to Google AI Ultra subscribers
  • Gemini API: Available for the first time through early access. Researchers, engineers, and enterprises can express interest via the official sign-up form.

Benchmark Performance

Google reports the following results for the upgraded Deep Think on academic and competitive benchmarks:

  • 48.4% on Humanity's Last Exam (without tools), described as a new standard for frontier models
  • 84.6% on ARC-AGI-2, verified by the ARC Prize Foundation
  • Elo rating of 3455 on Codeforces competitive programming challenges
  • Gold-medal-level performance on the International Mathematical Olympiad 2025

Real-World Applications

Early testers have already demonstrated practical impact:

  • Mathematics research: A Rutgers mathematician used Deep Think to review a technical mathematics paper, and the review surfaced a subtle logical flaw that human peer review had missed.
  • Materials science: Duke University's Wang Lab used Deep Think to optimize crystal growth fabrication methods, designing recipes for thin films exceeding 100 μm, a target that previous methods had struggled to reach.
  • Hardware design: Google engineers accelerated the design process for physical components using Deep Think's reasoning capabilities.

Taken together, these examples show Deep Think pairing deep scientific knowledge with engineering utility, so that its reasoning carries over from abstract theory to practical work in several domains.