OpenAI releases GPT-5.4 with native computer-use capabilities and 83% win rate on knowledge work

Overview

OpenAI has released GPT-5.4, a new frontier model designed for professional work across ChatGPT, the API, and Codex. The model brings together advances in reasoning, coding, and agentic workflows, incorporating industry-leading coding capabilities while improving performance on knowledge work tasks like spreadsheets, presentations, and documents.

Knowledge Work Improvements

GPT-5.4 delivers substantial improvements on professional tasks:

GDPval benchmark: Achieves an 83.0% win rate against industry professionals across 44 occupations, up from 70.9% for GPT-5.2
Spreadsheet modeling: Achieves 87.3% on internal investment banking analyst benchmarks, compared to 68.4% for GPT-5.2
Presentations: Human raters prefer GPT-5.4 outputs 68% of the time due to superior aesthetics and visual variety
Factuality: Individual claims are 33% less likely to be false; full responses are 18% less likely to contain errors

Computer Use and Agent Capabilities

GPT-5.4 is the first general-purpose OpenAI model with native computer-use capabilities, enabling agents to operate computers and complete complex workflows:

Achieves a 75.0% success rate on OSWorld-Verified (desktop navigation via screenshots and keyboard/mouse), exceeding the human baseline of 72.4%
Supports up to 1M tokens of context for long-horizon task planning and execution
Introduces tool search to help agents find and use the right tools efficiently
Steerable via developer messages with configurable safety policies
Writes code that operates computers through libraries like Playwright

Performance and Efficiency

GPT-5.4 is OpenAI's most token-efficient reasoning model yet, delivering faster speeds and reduced token costs compared to GPT-5.2. Additional benchmark improvements include:

SWE-Bench Pro: 57.7% (up from 55.6% for GPT-5.2)
Toolathlon: 54.6% (up from 46.3% for GPT-5.2)
BrowseComp: 82.7% (up from 65.8% for GPT-5.2)
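
To put the GPT-5.2 comparisons quoted in this article on a common footing, the short Python sketch below computes each gain both in percentage points and relative to the GPT-5.2 score. The scores are copied from the article; the names `SCORES` and `gains` are illustrative and not part of any OpenAI tooling.

```python
# Scores in percent as quoted in this article: (GPT-5.2, GPT-5.4).
SCORES = {
    "GDPval win rate": (70.9, 83.0),
    "Spreadsheet modeling": (68.4, 87.3),
    "SWE-Bench Pro": (55.6, 57.7),
    "Toolathlon": (46.3, 54.6),
    "BrowseComp": (65.8, 82.7),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (gain in percentage points, gain relative to the old score in percent)."""
    absolute = new - old
    relative = 100.0 * absolute / old
    return round(absolute, 1), round(relative, 1)


if __name__ == "__main__":
    for name, (old, new) in SCORES.items():
        points, relative = gains(old, new)
        print(f"{name:22s} {old:5.1f} -> {new:5.1f}  +{points:.1f} pp  (+{relative:.1f}% relative)")
```

The two measures can rank the benchmarks differently, so the distinction matters when comparing a small gain on a low baseline with a larger gain on a high one. These benchmarks measure different things, so the gains are not directly comparable across rows.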

Availability and Action Items

ChatGPT: GPT-5.4 Thinking and GPT-5.4 Pro are now available
API: Full access to GPT-5.4 with computer-use capabilities and 1M token context window
Codex: Updated spreadsheet and presentation skills available
Excel integration: New ChatGPT for Excel add-in launched for Enterprise customers

Developers building agents and professionals using ChatGPT can immediately leverage these capabilities for complex workflow automation and knowledge work tasks.
