OpenAI releases GPT-5.4 mini and nano; smaller models match larger peer performance at 2x faster speeds
OpenAI API · ChatGPT · release · model · feature · api · performance · Source: openai.com

New Compact Models Available Today

OpenAI has released GPT-5.4 mini and GPT-5.4 nano, streamlined versions of its flagship GPT-5.4 model. Both models are now available in the OpenAI API, with GPT-5.4 mini also available in Codex and ChatGPT. These models prioritize speed and efficiency for workloads where latency directly impacts user experience.
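
As a minimal sketch of what calling the new model through the Chat Completions API might look like: the model identifier `gpt-5.4-mini` is an assumption inferred from the naming in this announcement, so check the official model list before relying on it. The helper only assembles the request body, which in practice would be sent with the official SDK.

```python
# Sketch of a Chat Completions request for a latency-sensitive task.
# NOTE: the model id "gpt-5.4-mini" is assumed from this announcement's
# naming, not confirmed against the official model list.

def build_chat_request(prompt: str, model: str = "gpt-5.4-mini") -> dict:
    """Assemble a Chat Completions request body for a single user prompt."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_chat_request("Summarize this stack trace in one sentence.")
# With the official Python SDK this body would be sent via
# client.chat.completions.create(**request).
```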

Performance and Capabilities

GPT-5.4 mini delivers substantial improvements across coding, reasoning, multimodal understanding, and tool use compared to GPT-5 mini:

  • Achieves 54.4% accuracy on SWE-Bench Pro (versus GPT-5 mini's 45.7%), approaching GPT-5.4's 57.7%
  • Runs more than 2x faster than GPT-5 mini
  • Scores 72.1% on OSWorld-Verified for computer use tasks, nearly matching GPT-5.4's 75.0%
  • Includes support for text, images, tool use, web search, file search, computer use, and a 400k context window

GPT-5.4 nano is the smallest and cheapest option, recommended for classification, data extraction, ranking, and simpler coding subagents. While delivering lower throughput than mini, it maintains strong cost efficiency.
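
The recommendation above suggests a simple routing rule: send lightweight tasks to nano and keep heavier work on mini. A hypothetical helper, in which the task labels and model identifiers are illustrative assumptions rather than official API values:

```python
# Hypothetical task router reflecting the guidance above: nano for
# classification, extraction, and ranking; mini for everything heavier.
# Model ids are assumed from this announcement, not confirmed.

NANO_TASKS = {"classification", "extraction", "ranking"}

def pick_model(task_type: str) -> str:
    """Choose the cheapest model suited to the given task type."""
    return "gpt-5.4-nano" if task_type in NANO_TASKS else "gpt-5.4-mini"
```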

Pricing and Availability

Model         Input Cost           Output Cost          Availability
GPT-5.4 mini  $0.75 per 1M tokens  $4.50 per 1M tokens  API, Codex, ChatGPT
GPT-5.4 nano  $0.20 per 1M tokens  $1.25 per 1M tokens  API only
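
To make the pricing concrete, a back-of-envelope cost estimator built from the table above (an illustrative helper, not an official calculator):

```python
# Per-request cost estimate from the announced pricing (USD per 1M tokens).
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 200k input + 50k output tokens on mini.
cost = estimate_cost("gpt-5.4-mini", 200_000, 50_000)
# 0.75 * 0.2 + 4.50 * 0.05 = 0.15 + 0.225 = 0.375 USD
```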

In Codex, GPT-5.4 mini consumes only 30% of GPT-5.4 quota, allowing developers to handle simpler tasks at approximately one-third the cost. Both models support Codex's subagent pattern, enabling larger models to delegate narrow tasks to faster, cheaper mini instances running in parallel.
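
The subagent pattern described above can be sketched as a simple fan-out: a primary model splits the work into narrow, independent tasks and runs them concurrently against cheaper mini instances. Here `run_mini_subagent` is a local stand-in for a real API call, so the sketch shows only the orchestration, not Codex's actual delegation mechanism.

```python
# Sketch of the subagent pattern: fan narrow tasks out to parallel workers.
# run_mini_subagent is a placeholder for an API call to GPT-5.4 mini.
from concurrent.futures import ThreadPoolExecutor

def run_mini_subagent(task: str) -> str:
    # In practice: call the API with model="gpt-5.4-mini" for this task.
    return f"done: {task}"

def delegate(tasks: list[str], max_workers: int = 4) -> list[str]:
    """Run independent narrow tasks concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_mini_subagent, tasks))

results = delegate(["fix lint errors", "write unit test", "update docstring"])
```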

Key Use Cases

The models are optimized for scenarios where responsiveness matters:

  • Coding assistants that need instant feedback for targeted edits and debugging
  • Subagent architectures where a primary model delegates to specialized workers
  • Computer use systems that interpret screenshots and navigate UIs in real-time
  • Multimodal applications requiring fast image reasoning

The performance-per-latency tradeoff strongly favors GPT-5.4 mini for coding workflows: it delivers coding quality approaching GPT-5.4 while maintaining sub-second response times.