# New Compact Models Released
OpenAI announced GPT-5.4 mini and GPT-5.4 nano, two new smaller models designed for latency-sensitive and cost-conscious applications. Both models are now available in the OpenAI API, with GPT-5.4 mini also available in ChatGPT and Codex.
## Performance Improvements
GPT-5.4 mini significantly outperforms its predecessor across multiple dimensions:
- Coding benchmarks: 54.4% on SWE-Bench Pro (up from 45.7% for GPT-5 mini), with performance approaching GPT-5.4's 57.7%
- Speed: Runs more than 2x faster than GPT-5 mini
- Tool use: 42.9% on Toolathlon (vs 26.9% for GPT-5 mini)
- Reasoning: 88.0% on GPQA Diamond
- Multimodal and computer use: 72.1% on OSWorld-Verified tasks
GPT-5.4 nano targets ultra-low-cost scenarios, scoring 52.4% on SWE-Bench Pro; it is recommended for classification, data extraction, ranking, and simple coding subagents.
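One way to apply this tiering in practice is a small routing helper that picks a model by task type. This is a minimal sketch: the task categories follow the recommendations above, but the model identifier strings (`gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`) are assumptions, not confirmed API names.

```python
# Sketch of a task-based model router. The tiering follows the
# recommendations in this announcement; the model ID strings are
# assumptions and should be checked against the API's model list.

# Tasks GPT-5.4 nano is recommended for
NANO_TASKS = {"classification", "extraction", "ranking", "simple_coding"}
# Latency-sensitive work where mini's speed matters
MINI_TASKS = {"coding_subtask", "computer_use", "realtime_multimodal"}

def pick_model(task_type: str) -> str:
    """Route a task to the cheapest model tier recommended for it."""
    if task_type in NANO_TASKS:
        return "gpt-5.4-nano"   # assumed model ID
    if task_type in MINI_TASKS:
        return "gpt-5.4-mini"   # assumed model ID
    return "gpt-5.4"            # default to the full model
```

Anything not matching a known cheap-tier category falls through to the full model, so misrouted tasks degrade in cost rather than quality.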
## Use Cases & Architecture
The models excel in workflows where latency directly impacts user experience: coding assistants, subagent architectures (where a larger model delegates to smaller models), computer-using systems processing screenshots, and real-time multimodal applications.
GPT-5.4 mini is particularly suited for subagent patterns, where GPT-5.4 handles planning and coordination while mini instances execute focused subtasks in parallel—reducing latency and cost compared to running everything on the larger model.
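The fan-out half of that pattern can be sketched as below, assuming the OpenAI Python SDK; the model IDs and prompt shapes here are illustrative assumptions, not confirmed details. The parallel runner takes any callable, which keeps the pattern testable without network access.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(run_subtask, subtasks, max_workers=4):
    """Run independent subtasks in parallel and return results in order.

    `run_subtask` is any callable (here: a call to a mini instance);
    keeping it injectable makes the pattern easy to test with a stub.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_subtask, subtasks))

if __name__ == "__main__":
    # Illustrative wiring only: the model ID is an assumption.
    from openai import OpenAI  # requires the openai package and an API key
    client = OpenAI()

    def run_on_mini(subtask: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-5.4-mini",  # assumed model ID
            messages=[{"role": "user", "content": subtask}],
        )
        return resp.choices[0].message.content

    # In the full pattern, GPT-5.4 would first produce this subtask list
    # during its planning step, then aggregate the returned results.
    print(fan_out(run_on_mini, ["lint module A", "write tests for B"]))
```

Because subtasks run concurrently, end-to-end latency is bounded by the slowest subtask rather than the sum of all of them.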
## Availability & Pricing
| Model | Input Cost | Output Cost | Features | Availability |
|---|---|---|---|---|
| GPT-5.4 mini | $0.75/1M | $4.50/1M | 400k context, tool use, function calling, web search, file search, computer use, skills | API, ChatGPT, Codex |
| GPT-5.4 nano | $0.20/1M | $1.25/1M | Text inputs, basic tool use | API only |
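At the table's per-million-token rates, the cost of a request is a simple linear function of input and output tokens. A quick sketch (prices taken directly from the table; the model ID strings are assumptions):

```python
# Per-million-token prices (USD) from the table above.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

For example, a request with 10,000 input tokens and 2,000 output tokens costs about $0.0165 on mini versus $0.0045 on nano, so output-heavy workloads see the largest savings from the cheaper tier.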
In Codex, GPT-5.4 mini uses only 30% of the GPT-5.4 quota, making it approximately one-third the cost for simpler coding tasks. In ChatGPT, it's available to Free/Go users and as a fallback for GPT-5.4 Thinking users.