OpenAI releases GPT-5.4, matching industry professionals on 83% of knowledge work tasks

GPT-5.4: A New Frontier Model for Professional Work

OpenAI has released GPT-5.4, its most capable and efficient frontier model designed specifically for professional workflows. The model is available across ChatGPT (as GPT-5.4 Thinking and GPT-5.4 Pro), the OpenAI API, and Codex. GPT-5.4 combines recent advances in reasoning, coding, and agentic workflows into a single model that handles complex real-world tasks with improved accuracy and efficiency.

Key Capabilities and Improvements

Knowledge Work Performance: GPT-5.4 achieves an 83.0% win rate against industry professionals on the GDPval benchmark, up 12.1 percentage points from GPT-5.2's 70.9%. The model also shows large gains in specific professional domains:

Spreadsheet modeling: 87.3% accuracy on investment banking tasks (vs. 68.4% for GPT-5.2)
Legal document analysis: 91% score on BigLaw Bench with superior contract analysis and transactional accuracy
Presentation generation: Human raters preferred GPT-5.4 outputs 68% of the time for visual quality and design

Computer Use: GPT-5.4 is the first general-purpose OpenAI model with native computer-use capabilities, enabling agents to operate computers and complete complex workflows across applications. The model can write code that controls a computer via Playwright, issue mouse and keyboard commands directly, and operate autonomously under customizable safety policies. On OSWorld-Verified, the model scores 75.0%.
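To illustrate what a customizable safety policy over mouse and keyboard commands could look like, here is a minimal Python sketch. It does not use the OpenAI API or Playwright; the names `Action`, `SafetyPolicy`, and `review`, along with the rule set itself, are hypothetical and chosen only for illustration. The idea is that each action an agent proposes is checked against the policy before anything is executed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """A proposed UI action (hypothetical shape, not an OpenAI type)."""
    kind: str         # e.g. "click", "type", "key"
    target: str = ""  # e.g. the application the action is aimed at
    text: str = ""    # text to type, if any


@dataclass
class SafetyPolicy:
    """Illustrative policy: allow-list of action kinds, block-list of targets,
    and a list of substrings that require human confirmation."""
    allowed_kinds: set[str] = field(
        default_factory=lambda: {"click", "type", "key"}
    )
    blocked_targets: set[str] = field(default_factory=set)
    confirm_substrings: tuple[str, ...] = ("password", "delete")

    def review(self, action: Action) -> str:
        """Return "allow", "confirm", or "deny" for a proposed action."""
        if action.kind not in self.allowed_kinds:
            return "deny"
        if action.target in self.blocked_targets:
            return "deny"
        lowered = action.text.lower()
        if any(s in lowered for s in self.confirm_substrings):
            return "confirm"
        return "allow"


# Example: block the terminal entirely, ask a human before risky typing.
policy = SafetyPolicy(blocked_targets={"terminal"})
proposed = [
    Action("click", "browser"),
    Action("type", "browser", "delete all rows"),
    Action("click", "terminal"),
    Action("scroll", "browser"),  # "scroll" is not in the allow-list
]
verdicts = [policy.review(a) for a in proposed]
print(verdicts)
```

A real agent loop would dispatch only the actions marked "allow", pause for a person on "confirm", and drop or report the rest.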

Efficiency and Context: GPT-5.4 is OpenAI's most token-efficient reasoning model yet, using significantly fewer tokens than GPT-5.2 while delivering faster responses. The model supports up to 1M tokens of context, allowing agents to plan, execute, and verify complex tasks across extended horizons.

Reasoning Transparency: In ChatGPT, GPT-5.4 Thinking can now present its thinking plan up front, so users can adjust course mid-response before the final output is produced, which reduces back-and-forth iterations.

Factuality and Tool Integration

The model reduces hallucinations significantly: individual claims are 33% less likely to be false and full responses are 18% less likely to contain errors compared to GPT-5.2. GPT-5.4 improves tool usage with new "tool search" functionality, helping agents find and use the right tools more efficiently across large ecosystems of integrations.
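The hallucination figures above are relative reductions, not absolute error rates, so what they mean in practice depends on a baseline the article does not give. The short Python sketch below applies the two stated reductions to hypothetical baseline rates (10% for individual claims and 20% for full responses, both invented purely for illustration).

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (e.g. 0.33 for "33% less likely")
    to a baseline error rate expressed as a fraction in [0, 1]."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


# Hypothetical baselines; the article reports only the relative reductions.
claim_baseline = 0.10     # assumed share of false individual claims
response_baseline = 0.20  # assumed share of responses containing an error

claim_rate = reduced_rate(claim_baseline, 0.33)
response_rate = reduced_rate(response_baseline, 0.18)

print(f"claims:    {claim_baseline:.1%} -> {claim_rate:.1%}")
print(f"responses: {response_baseline:.1%} -> {response_rate:.1%}")
```

Under these assumed baselines, a 33% relative reduction takes a 10% false-claim rate to 6.7%, a drop of 3.3 percentage points rather than 33 points.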

Developer Action Items

ChatGPT users: Try GPT-5.4 Thinking or Pro in the UI for knowledge work and spreadsheet/document creation
Enterprise users: Check out the newly launched ChatGPT for Excel add-in for spreadsheet workflows
API/Codex developers: Implement computer-use capabilities for agentic workflows; review updated spreadsheet and presentation skills in the OpenAI GitHub repository
