OpenAI launches GPT-5.4 with computer-use capabilities, matching or exceeding professionals in 83% of GDPval comparisons

Key Capabilities

OpenAI has released GPT-5.4, a new frontier model available in ChatGPT (as GPT-5.4 Thinking and Pro variants), the OpenAI API, and Codex. The model combines advances in reasoning, coding, and agentic workflows, incorporating state-of-the-art coding capabilities while improving performance across professional tasks like spreadsheet modeling, presentations, and document creation.

Computer Use & Agent Features

GPT-5.4 is OpenAI's first general-purpose model with native computer-use capabilities, enabling agents to operate computers and execute complex workflows across applications. Key features include:

Computer control: Developers can build agents that navigate desktop environments via screenshots, keyboard, and mouse commands, as well as programmatically via libraries like Playwright
Extended context: Supports up to 1M tokens of context, allowing agents to plan, execute, and verify tasks across long horizons
Tool search: New capability helps agents discover and use the right tools more efficiently without sacrificing intelligence
Steerable behavior: Developers can adjust the model's behavior and configure safety policies to match their risk tolerance

On the OSWorld-Verified benchmark, which tests desktop navigation through screenshots and keyboard/mouse actions, GPT-5.4 achieves a 75.0% success rate. That exceeds the reported human performance of 72.4% and is well above GPT-5.2's 47.3%.
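
The observe-decide-act loop behind a screenshot-driven agent can be sketched roughly as follows. This is a minimal illustration, not OpenAI's API: `FakeDesktop`, `choose_action`, `run_agent`, and the action dictionary format are all hypothetical stand-ins. A real agent would presumably call the model where `choose_action` sits and drive an actual desktop, or a browser library such as Playwright, where `FakeDesktop` sits.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real desktop or browser session."""
    text: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real environment would return image bytes; a string keeps the sketch runnable.
        return f"textbox contains: {self.text!r}"

    def execute(self, action: dict) -> None:
        # Record the action, then apply the only action type this toy environment supports.
        self.log.append(action)
        if action["type"] == "type":
            self.text += action["text"]


def choose_action(goal: str, observation: str) -> dict:
    """Hypothetical policy; a real agent would send the goal and screenshot to the model here."""
    if repr(goal) in observation:
        return {"type": "done"}
    return {"type": "type", "text": goal}


def run_agent(env: FakeDesktop, goal: str, max_steps: int = 5) -> bool:
    """Observe, decide, act; stop when the policy reports completion or the step budget runs out."""
    for _ in range(max_steps):
        observation = env.screenshot()
        action = choose_action(goal, observation)
        if action["type"] == "done":
            return True
        env.execute(action)
    return False


env = FakeDesktop()
finished = run_agent(env, goal="hello")
```

The step budget (`max_steps`) is one simple way to keep a long-horizon task from looping indefinitely; the verification step here is just the policy re-reading the screen after acting.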

Professional Knowledge Work

GPT-5.4 delivers notable improvements in real-world professional tasks. On the GDPval benchmark (testing agents across 44 occupations), GPT-5.4 matches or exceeds industry professionals in 83.0% of comparisons versus 70.9% for GPT-5.2. Notable gains include:

Spreadsheet modeling: 87.3% mean score on junior investment banking tasks (vs. 68.4% for GPT-5.2)
Presentations: Human raters preferred GPT-5.4 outputs 68% of the time due to stronger aesthetics and visual variety
Factual accuracy: 33% fewer false claims and 18% fewer errors overall compared to GPT-5.2
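
For readers comparing the two models, the scores quoted in this article can be restated as percentage-point gains. The snippet below is only an arithmetic sketch over the figures reported above; the dictionary layout and the `point_gain` helper are illustrative choices, not part of any benchmark tooling.

```python
# Scores in percent as quoted in this article: (GPT-5.4, GPT-5.2).
scores = {
    "OSWorld-Verified": (75.0, 47.3),
    "GDPval": (83.0, 70.9),
    "Spreadsheet modeling": (87.3, 68.4),
}


def point_gain(new: float, old: float) -> float:
    """Absolute difference in percentage points, rounded to one decimal place."""
    return round(new - old, 1)


gains = {name: point_gain(new, old) for name, (new, old) in scores.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```

These are absolute differences between percentages, not relative improvements, so they should not be read as "X% better".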

ChatGPT users can leverage the new ChatGPT for Excel add-in, while developers gain access to updated spreadsheet and presentation skills in Codex and the API.

Token Efficiency & Reasoning

GPT-5.4 is OpenAI's most token-efficient reasoning model to date. It uses significantly fewer tokens than GPT-5.2 to solve problems, which translates to lower costs and faster responses.

In ChatGPT, the GPT-5.4 Thinking variant now provides upfront thinking plans that users can review and adjust mid-response, improving alignment with user needs and reducing back-and-forth iterations.

Availability

GPT-5.4 is available now in ChatGPT, the OpenAI API, and Codex. Enterprise customers with spreadsheet-heavy workflows can also try the newly released ChatGPT for Excel integration.
