New Efficient Models for Production Workloads
OpenAI has released GPT-5.4 mini and GPT-5.4 nano, smaller optimized models built for high-volume, latency-sensitive production workloads. They deliver much of the flagship GPT-5.4's capability in faster, more affordable packages designed for scenarios where response time directly impacts user experience.
GPT-5.4 mini represents a substantial upgrade over GPT-5 mini, with improvements across coding, reasoning, multimodal understanding, and tool use. The model runs more than 2x faster than its predecessor and delivers a strong performance-per-latency tradeoff. On SWE-Bench Pro, it achieves 54.4% accuracy compared to GPT-5.4's 57.7%—within 3.3 percentage points while executing significantly faster.
GPT-5.4 nano serves the smallest-model use case, optimized for classification, data extraction, ranking, and simpler coding subagents. It's priced at $0.20 per 1M input tokens and $1.25 per 1M output tokens, making it the most cost-effective option for volume-heavy applications.
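For a task like ticket classification, a nano call needs little more than a system instruction and the text to classify. The sketch below builds such a request payload; the payload shape follows the OpenAI Responses API convention, but the helper name, field contents, and category labels are illustrative assumptions, not part of the announcement.

```python
# Hypothetical request builder for a classification task on GPT-5.4 nano.
# The model name comes from the announcement; everything else (function
# name, system prompt, categories) is illustrative.
def build_classification_request(ticket_text: str) -> dict:
    return {
        "model": "gpt-5.4-nano",
        "input": [
            {
                "role": "system",
                "content": (
                    "Classify the support ticket as exactly one of: "
                    "billing, bug, feature_request. Reply with the label only."
                ),
            },
            {"role": "user", "content": ticket_text},
        ],
    }

req = build_classification_request("My invoice shows a duplicate charge.")
```

In practice this dict would be passed to an API client (e.g. the OpenAI SDK); keeping the prompt to a single-label reply keeps output tokens, and therefore cost, minimal.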
Use Cases and Deployment
These models excel in specific production patterns:
- Coding assistants: GPT-5.4 mini handles targeted edits, codebase navigation, front-end generation, and debugging with low latency
- Subagent architectures: Larger models like GPT-5.4 can handle planning and coordination, delegating narrower subtasks to GPT-5.4 mini subagents that execute in parallel—particularly useful in systems like OpenAI's Codex
- Computer use and multimodal tasks: GPT-5.4 mini shows strong performance on screenshot interpretation (72.1% on OSWorld-Verified), nearly matching GPT-5.4's 75.0%
- Real-time applications: Both models support tool use, function calling, web search, computer use, and file operations
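The subagent pattern above can be sketched as a planner fanning narrow subtasks out to parallel workers. This is a minimal illustration only: the stub below stands in for a real GPT-5.4 mini API call, and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub standing in for a real model invocation; in production this would
# call GPT-5.4 mini (e.g. via an API client). Name is illustrative.
def run_subagent(subtask: str) -> str:
    return f"[gpt-5.4-mini] done: {subtask}"

def fan_out(subtasks: list[str]) -> list[str]:
    # A larger planner model (e.g. GPT-5.4) would produce this subtask
    # list; here we only show the parallel-execution half of the pattern.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_subagent, subtasks))

results = fan_out(["fix lint errors", "update README", "add a unit test"])
```

Because the subtasks are independent, wall-clock latency is bounded by the slowest subagent rather than the sum of all calls, which is where the faster mini model pays off.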
Pricing and Availability
GPT-5.4 mini is available across:
- API: $0.75 per 1M input tokens, $4.50 per 1M output tokens; supports a 400k-token context window with text, images, tools, web search, file search, and computer use
- Codex: Uses only 30% of GPT-5.4 quota, offering ~1/3 the cost for simpler tasks; available in Codex app, CLI, IDE extension, and web
- ChatGPT: Available to Free and Go users via the Thinking feature; also serves as a rate-limit fallback for paid GPT-5.4 Thinking users
GPT-5.4 nano is available in the API only at $0.20 per 1M input tokens and $1.25 per 1M output tokens.
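The listed prices make per-request cost easy to estimate. The sketch below uses the announced per-1M-token rates; the function and the sample token counts are illustrative.

```python
# Per-1M-token prices (USD) as listed in the announcement.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for a single request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example workload: 2,000 input tokens and 500 output tokens per request.
mini_cost = request_cost("gpt-5.4-mini", 2_000, 500)  # $0.00375
nano_cost = request_cost("gpt-5.4-nano", 2_000, 500)  # $0.001025
```

At these sample token counts, nano comes in at roughly a quarter of mini's per-request cost, which is why it suits volume-heavy classification and extraction workloads.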
Both models are available immediately.