OpenAI releases GPT-5.4, achieving 83% win rate on professional knowledge work tasks
Overview
OpenAI has released GPT-5.4, a new frontier model designed specifically for professional work. The model is available immediately in ChatGPT (as GPT-5.4 Thinking and GPT-5.4 Pro), the OpenAI API, and Codex. GPT-5.4 combines advances in reasoning, coding, and agentic workflows, building on GPT-5.3-Codex's coding capabilities while improving performance across tools, software environments, and professional tasks like spreadsheets, presentations, and documents.
Key Improvements
Knowledge Work & Professional Tasks
- Achieves 83.0% win rate on GDPval (professional knowledge work across 44 occupations), up from 70.9% for GPT-5.2
- On spreadsheet modeling benchmarks, scores 87.3% compared to 68.4% for GPT-5.2
- Human raters preferred GPT-5.4-generated presentations 68.0% of the time over GPT-5.2 for aesthetics and visual effectiveness
- 33% fewer false claims and 18% fewer errors in full responses compared to GPT-5.2
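To put the benchmark figures above in perspective, the short sketch below computes the gain over GPT-5.2 both in percentage points and in relative terms. It uses only the GDPval and spreadsheet modeling scores quoted in this summary; the `gains` helper and the `scores` dictionary are illustrative names, not part of any OpenAI tooling.

```python
# Scores quoted in this summary, as (GPT-5.2, GPT-5.4) percentages.
scores = {
    "GDPval win rate": (70.9, 83.0),
    "Spreadsheet modeling": (68.4, 87.3),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    points = new - old
    relative = points / old * 100
    return round(points, 1), round(relative, 1)


for name, (old, new) in scores.items():
    points, relative = gains(old, new)
    print(f"{name}: +{points} points ({relative}% relative)")
```

The two measures answer different questions: the percentage-point gain is the raw difference between the scores, while the relative gain expresses that difference as a share of the GPT-5.2 baseline.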
Computer Use & Agents
- First general-purpose OpenAI model with native computer-use capabilities, enabling agents to operate computers across websites and software systems
- Supports context windows of up to 1M tokens for long-horizon task planning and execution
- New tool search feature helps agents find and use appropriate tools more efficiently
- Most token-efficient reasoning model yet: uses fewer tokens and responds faster than GPT-5.2
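The release does not describe how tool search works internally. As a rough intuition for the idea only, the sketch below picks the most relevant tools from a registry by keyword overlap instead of handing an agent every tool at once. Everything here (`Tool`, `search_tools`, `REGISTRY`, and the scoring rule) is a hypothetical illustration and not OpenAI's implementation or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """A hypothetical tool entry: a name plus a one-line description."""
    name: str
    description: str


def search_tools(query: str, registry: list[Tool], limit: int = 2) -> list[Tool]:
    """Rank tools by how many query words appear in their name or description."""
    words = set(query.lower().split())
    scored = []
    for tool in registry:
        # Treat underscores in tool names as word separators.
        text = f"{tool.name} {tool.description}".lower().replace("_", " ")
        overlap = len(words & set(text.split()))
        if overlap:
            scored.append((overlap, tool))
    # Highest overlap first; tools with no matching words are never returned.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


# Assumed example registry, loosely themed on the tasks named in this summary.
REGISTRY = [
    Tool("read_spreadsheet", "load cells from a spreadsheet file"),
    Tool("send_email", "send an email message to a recipient"),
    Tool("make_slides", "create a presentation from an outline"),
]

matches = search_tools("update the spreadsheet cells", REGISTRY)
print([tool.name for tool in matches])
```

A production system would more plausibly rank tools with embeddings or a learned retriever; plain word overlap is used here only to keep the example self-contained.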
ChatGPT Features
- GPT-5.4 Thinking now shares its plan upfront, so users can adjust course mid-response, before the final output is produced
- Improved deep web research for highly specific queries
- Better context maintenance for extended thinking tasks
Developer Impact
Developers building agents and using the API gain several advantages:
- Native computer-use capabilities enable more reliable agents for complex automation tasks
- Significantly reduced token consumption improves cost efficiency and response speed
- Updated spreadsheet and presentation skills available in Codex and the API
- Custom safety and confirmation policies allow developers to adjust behavior for different use cases
Enterprise customers can also use the newly launched ChatGPT for Excel add-in for spreadsheet work.