OpenAI releases GPT-5.4 with native computer-use capabilities and 83% professional task performance

Availability and Core Capabilities

OpenAI has released GPT-5.4 across ChatGPT (as GPT-5.4 Thinking), the API, and Codex. A premium tier, GPT-5.4 Pro, is also available for users who need maximum performance on complex tasks. GPT-5.4 is the company's most capable frontier model to date, combining advances in reasoning, coding, and agentic workflows aimed at professional work.

Professional Knowledge Work

GPT-5.4 delivers significant improvements on real-world professional tasks. On GDPval, a benchmark spanning 44 occupations across the top 9 U.S. GDP-contributing industries, GPT-5.4 achieves an 83.0% win rate against industry professionals, up from 70.9% for GPT-5.2. The model performs particularly well on document-heavy tasks:

Spreadsheet modeling: 87.3% mean score on junior investment banking analyst tasks (vs. 68.4% for GPT-5.2)
Presentations: Human raters preferred GPT-5.4 output 68% of the time due to superior aesthetics and visual variety
Factual accuracy: Compared with GPT-5.2, individual claims are 33% less likely to be false, and full responses contain 18% fewer errors

ChatGPT users can now see GPT-5.4 Thinking's reasoning plan upfront and adjust course mid-response, improving iteration efficiency. The model also enhances deep web research capabilities and maintains context better for complex, lengthy reasoning tasks.

Computer-Use Breakthrough

GPT-5.4 is the first general-purpose model from OpenAI with native computer-use capabilities, enabling agents to operate computers and carry out complex workflows across applications via screenshot analysis and mouse/keyboard commands. Key performance metrics:

OSWorld-Verified: 75.0% success rate on desktop navigation tasks (exceeding human performance at 72.4% and GPT-5.3-Codex at 74.0%)
SWE-Bench Pro: 57.7% on software engineering tasks
Toolathlon: 54.6% on multi-step tool interactions

Developers can steer model behavior through custom messages and configure safety policies to match different risk tolerances.
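The screenshot-and-input loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not OpenAI's API: `Action`, `run_agent`, and the `capture`, `decide`, `execute`, and `needs_approval` callbacks are hypothetical names introduced here, and the model call is replaced by a scripted stand-in. The `needs_approval` hook stands in for a configurable safety policy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One hypothetical UI action proposed by the model."""
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    capture: Callable[[], bytes],
    decide: Callable[[bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    needs_approval: Callable[[Action], bool] = lambda action: False,
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask for the next action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                 # observe the current screen
        action = decide(screenshot, history)   # stand-in for the model call
        if action.kind == "done":
            break
        if needs_approval(action):             # policy hook: stop on risky actions
            break
        execute(action)                        # send the mouse/keyboard command
        history.append(action)
    return history


# Scripted stand-ins so the sketch runs without a real screen or model.
script = iter([Action("click", 10, 20), Action("type", text="hello"), Action("done")])
performed: List[Action] = []
history = run_agent(
    capture=lambda: b"",
    decide=lambda shot, hist: next(script),
    execute=performed.append,
)

# A stricter policy that refuses any typing stops the loop after the click.
strict_script = iter([Action("click", 1, 1), Action("type", text="secret")])
strict_history = run_agent(
    capture=lambda: b"",
    decide=lambda shot, hist: next(strict_script),
    execute=lambda action: None,
    needs_approval=lambda action: action.kind == "type",
)
```

Keeping the policy check as a separate callback means a stricter or looser risk tolerance can be swapped in without touching the loop itself.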

Technical Improvements and Developer Experience

The model supports 1M tokens of context, enabling agents to plan, execute, and verify tasks over extended horizons. It introduces tool search functionality, helping agents efficiently locate and use appropriate tools without sacrificing intelligence. Most importantly, GPT-5.4 is OpenAI's most token-efficient reasoning model yet, significantly reducing token usage and improving response speeds compared to GPT-5.2.
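The article does not say how tool search works internally. As a rough illustration of the general idea (selecting a few relevant tools from a large catalog instead of handing the agent every definition up front), here is a minimal keyword-overlap sketch. `Tool`, `search_tools`, and the catalog entries are hypothetical, and plain word matching is an assumption chosen for brevity, not the mechanism GPT-5.4 uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tool:
    """A hypothetical tool definition: a name plus a short description."""
    name: str
    description: str


def search_tools(query: str, tools: List[Tool], limit: int = 2) -> List[Tool]:
    """Return up to `limit` tools whose name or description shares words with the query."""
    terms = set(query.lower().split())
    scored = []
    for tool in tools:
        text = f"{tool.name} {tool.description}".lower().replace("_", " ")
        score = len(terms & set(text.split()))   # count of shared words
        if score > 0:
            scored.append((score, tool))
    # Highest overlap first; the sort is stable, so ties keep catalog order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


CATALOG = [
    Tool("read_spreadsheet", "read cells from a spreadsheet file"),
    Tool("send_email", "send an email message to a recipient"),
    Tool("create_chart", "create a chart from spreadsheet data"),
]

matches = search_tools("read spreadsheet cells", CATALOG)
```

Only the matching definitions would then be placed in the agent's context, which is one way a large tool catalog can be supported without spending tokens on tools the task never needs.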

For developers, updated spreadsheet and presentation skills are now available in Codex and the API. Enterprise customers also have access to a newly released ChatGPT for Excel add-in for spreadsheet collaboration.
