OpenAI
OpenAI releases GPT-5.4 with native computer-use capabilities and 83% win rate on knowledge work
OpenAI API, ChatGPT · release, feature, model, api, performance · openai.com

Overview

OpenAI has released GPT-5.4, a new frontier model designed for professional work across ChatGPT, the API, and Codex. The model brings together advances in reasoning, coding, and agentic workflows, incorporating industry-leading coding capabilities while improving performance on knowledge work tasks like spreadsheets, presentations, and documents.

Knowledge Work Improvements

GPT-5.4 delivers substantial improvements on professional tasks:

  • GDPval benchmark: Achieves 83.0% win rate against industry professionals across 44 occupations, up from 70.9% for GPT-5.2
  • Spreadsheet modeling: Achieves 87.3% on internal investment banking analyst benchmarks, compared to 68.4% for GPT-5.2
  • Presentations: Human raters prefer GPT-5.4 outputs over GPT-5.2 outputs 68% of the time, citing stronger aesthetics and greater visual variety
  • Factuality: Relative to GPT-5.2, individual claims are 33% less likely to be false and full responses are 18% less likely to contain errors
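
The factuality figures are relative reductions, not absolute error rates, so what they mean in practice depends on the starting point. The sketch below applies the 33% and 18% reductions to baseline rates to show the arithmetic. The two baseline rates are hypothetical placeholders chosen for illustration; the source does not report absolute error rates.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Return the rate that remains after a relative reduction is applied."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_rate * (1.0 - reduction)


# Hypothetical baselines (NOT from the source): share of false claims and
# share of responses containing at least one error for the older model.
HYPOTHETICAL_CLAIM_ERROR_RATE = 0.10
HYPOTHETICAL_RESPONSE_ERROR_RATE = 0.30

# Relative reductions as reported in the list above.
new_claim_rate = apply_relative_reduction(HYPOTHETICAL_CLAIM_ERROR_RATE, 0.33)
new_response_rate = apply_relative_reduction(HYPOTHETICAL_RESPONSE_ERROR_RATE, 0.18)

print(f"claims:    {HYPOTHETICAL_CLAIM_ERROR_RATE:.1%} -> {new_claim_rate:.1%}")
print(f"responses: {HYPOTHETICAL_RESPONSE_ERROR_RATE:.1%} -> {new_response_rate:.1%}")
```

Under these assumed baselines, a 10% false-claim rate would fall to about 6.7%, and a 30% rate of responses with errors would fall to about 24.6%. Different baselines give different absolute numbers.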

Computer Use and Agent Capabilities

GPT-5.4 is the first general-purpose OpenAI model with native computer-use capabilities, enabling agents to operate computers and complete complex workflows:

  • Achieves 75.0% success rate on OSWorld-Verified (desktop navigation via screenshots and keyboard/mouse), exceeding human performance at 72.4%
  • Supports up to 1M tokens of context for long-horizon task planning and execution
  • Introduces tool search to help agents find and use the right tools efficiently
  • Steerable via developer messages with configurable safety policies
  • Strong at writing code that operates computers through automation libraries like Playwright
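
The general idea behind tool search can be shown with a toy example: instead of handing an agent every tool definition up front, keep a registry and retrieve only the tools relevant to the current task. The sketch below is a naive keyword-overlap ranker and is not OpenAI's implementation. The `Tool` class, the `search_tools` function, and the three example tools are all hypothetical names invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record: a name plus a plain-text description."""
    name: str
    description: str


def _tokens(text: str) -> set[str]:
    """Lowercase the text and split it on whitespace into a set of words."""
    return set(text.lower().split())


def search_tools(query: str, registry: list[Tool], limit: int = 2) -> list[Tool]:
    """Rank tools by how many query words appear in their name or description."""
    query_words = _tokens(query)
    scored = []
    for tool in registry:
        overlap = len(query_words & _tokens(f"{tool.name} {tool.description}"))
        if overlap > 0:  # Drop tools that share no words with the query.
            scored.append((overlap, tool))
    # Highest overlap first; the sort is stable, so ties keep registry order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


# Hypothetical registry, for illustration only.
REGISTRY = [
    Tool("read_spreadsheet", "load cells from a spreadsheet file"),
    Tool("send_email", "send an email message to a recipient"),
    Tool("click_element", "click a button or link in the browser"),
]

matches = search_tools("click the submit button in the browser", REGISTRY)
print([tool.name for tool in matches])
```

With this registry, only `click_element` shares words with the query, so it is the only tool returned. A production system would more likely rank by embeddings or a learned retriever than by raw word overlap.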

Performance and Efficiency

GPT-5.4 is OpenAI's most token-efficient reasoning model yet, delivering faster speeds and reduced token costs compared to GPT-5.2. Additional benchmark improvements include:

  • SWE-Bench Pro: 57.7% (up from 55.6% for GPT-5.2)
  • Toolathlon: 54.6% (up from 46.3% for GPT-5.2)
  • BrowseComp: 82.7% (up from 65.8% for GPT-5.2)
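
To compare these gains on a common footing, the sketch below computes the percentage-point gain and the relative gain for each benchmark from the scores listed above. It uses only figures from this summary; the helper name `gains` is an illustrative choice.

```python
# Scores in percent, as (GPT-5.2, GPT-5.4), copied from the list above.
SCORES = {
    "SWE-Bench Pro": (55.6, 57.7),
    "Toolathlon": (46.3, 54.6),
    "BrowseComp": (65.8, 82.7),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    point_gain = new - old
    relative_gain = point_gain / old * 100.0
    return point_gain, relative_gain


for name, (old, new) in SCORES.items():
    points, relative = gains(old, new)
    print(f"{name}: +{points:.1f} points ({relative:.1f}% relative)")
```

The gains come to roughly +2.1 points (3.8% relative) on SWE-Bench Pro, +8.3 points (17.9% relative) on Toolathlon, and +16.9 points (25.7% relative) on BrowseComp, so the browsing and tool-use benchmarks improved far more than the coding benchmark.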

Availability and Action Items

  • ChatGPT: GPT-5.4 Thinking and GPT-5.4 Pro now available
  • API: Full access to GPT-5.4 with computer-use capabilities and 1M token context window
  • Codex: Updated spreadsheet and presentation skills available
  • Excel integration: New ChatGPT for Excel add-in launched for Enterprise customers

Developers building agents and professionals using ChatGPT can use these capabilities now for workflow automation and knowledge work.