OpenAI releases GPT-5.4 with computer-use capabilities and 83% professional work performance

Key Capabilities

OpenAI has released GPT-5.4 alongside a premium variant GPT-5.4 Pro, marking a major advancement in AI capabilities for professional work. The model is available across ChatGPT (as GPT-5.4 Thinking), the OpenAI API, and Codex.

For ChatGPT users, GPT-5.4 Thinking can lay out its reasoning plan upfront, and users can adjust that plan mid-response to correct course without spending additional turns. The model also improves at deep web research and at maintaining context on complex questions, delivering higher-quality answers more quickly.

For developers and agents, GPT-5.4 is the first general-purpose model from OpenAI with native, state-of-the-art computer-use capabilities. Agents can now operate computers and carry out complex workflows across applications, either through libraries like Playwright or by issuing mouse and keyboard commands directly in response to screenshots. The model supports up to 1M tokens of context and includes tool search functionality to help agents find and use the right tools more efficiently.
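
The screenshot-driven control described here amounts to an observe-act loop. The sketch below is only an illustration of that pattern, not OpenAI's API or Playwright's: the Action schema, the FakeDesktop environment, and scripted_policy (a stand-in for a model call) are all hypothetical names chosen so the loop runs without a browser or network access.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One UI command; kind is "click", "type", or "done" (hypothetical schema)."""
    kind: str
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """In-memory stand-in for a real screen (assumption, not a real library)."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real environment would return image bytes; here the frame
        # label simply encodes how many actions have been applied so far.
        return f"frame-{len(self.log)}".encode()

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x},{y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text})")


def run_agent(env: FakeDesktop,
              choose_action: Callable[[bytes], Action],
              max_steps: int = 10) -> List[str]:
    """Observe the screen, ask the policy for one action, apply it, repeat."""
    for _ in range(max_steps):
        action = choose_action(env.screenshot())
        if action.kind == "click":
            env.click(action.x, action.y)
        elif action.kind == "type":
            env.type_text(action.text)
        elif action.kind == "done":
            break
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
    return env.log


def scripted_policy(frame: bytes) -> Action:
    """Placeholder for a model call: looks the next action up by frame label."""
    script = {
        b"frame-0": Action("click", x=120, y=48),
        b"frame-1": Action("type", text="quarterly report"),
    }
    return script.get(frame, Action("done"))


if __name__ == "__main__":
    print(run_agent(FakeDesktop(), scripted_policy))
```

In a real integration, the environment would presumably wrap a browser or desktop session and the policy would send each screenshot to the model; the max_steps cap is a design choice to keep a confused agent from looping indefinitely.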

Performance Improvements

GPT-5.4 demonstrates significant benchmark improvements across multiple dimensions:

GDPval: Achieves 83.0% on professional knowledge work across 44 occupations, matching or exceeding industry professionals (vs. 70.9% for GPT-5.2)
OSWorld-Verified: 75.0% success rate on desktop navigation tasks, surpassing human performance at 72.4% (vs. 47.3% for GPT-5.2)
SWE-Bench Pro: 57.7% on software engineering benchmarks (vs. 55.6% for GPT-5.2)
Factuality: 33% fewer false individual claims and 18% fewer responses with any errors compared to GPT-5.2

The model shows particular strength in professional knowledge work, including spreadsheet modeling (87.3% on investment banking tasks vs. 68.4% for GPT-5.2) and presentations (68% human preference over GPT-5.2 for aesthetics and visual effectiveness).
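
To put the quoted scores on a common footing, the short sketch below computes both the percentage-point gain and the relative gain over GPT-5.2 for each benchmark that reports a paired figure. It uses only the numbers quoted above; the SCORES table and the gains helper are illustrative names, not part of any OpenAI tooling.

```python
# (benchmark, GPT-5.4 score, GPT-5.2 score), in percent, as quoted in the article
SCORES = [
    ("GDPval", 83.0, 70.9),
    ("OSWorld-Verified", 75.0, 47.3),
    ("SWE-Bench Pro", 57.7, 55.6),
    ("Investment banking spreadsheet tasks", 87.3, 68.4),
]


def gains(new: float, old: float) -> tuple:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    point_gain = round(new - old, 1)
    relative_gain = round(100 * (new - old) / old, 1)
    return point_gain, relative_gain


if __name__ == "__main__":
    for name, new, old in SCORES:
        points, relative = gains(new, old)
        print(f"{name}: +{points} points ({relative}% relative)")
```

The two measures tell different stories: OSWorld-Verified shows the largest jump on both (27.7 points, about 58.6% relative), while SWE-Bench Pro moves only 2.1 points, or roughly 3.8% relative.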

Token Efficiency and Developer Tools

GPT-5.4 is OpenAI's most token-efficient reasoning model yet. It uses significantly fewer tokens than GPT-5.2 to solve the same problems, which translates to lower costs and faster responses. The model's safety behavior is configurable via developer messages, allowing teams to adjust confirmation policies based on their risk tolerance.

OpenAI has also released a new ChatGPT for Excel add-in for Enterprise customers and updated spreadsheet and presentation skills available in Codex and the API. Developers can begin integrating these capabilities immediately through the OpenAI API.
