OpenAI has launched GPT-5.4 Mini and GPT-5.4 Nano, introducing faster and more cost-efficient AI models designed for real-world workloads. The new models focus on delivering strong performance while reducing latency and operational costs.

GPT-5.4 Mini brings major upgrades across coding, reasoning, and multimodal tasks. It runs more than twice as fast as GPT-5 Mini while maintaining performance levels close to the full GPT-5.4 model.
Developers can use it for:
- Code generation and debugging
- Front-end and UI development
- Tool-based workflows
- Multimodal tasks like image understanding
In benchmarks such as SWE-Bench Pro and OSWorld-Verified, the model nearly matches GPT-5.4, making it a strong choice for production systems where speed matters.
It also supports:
- Text and image inputs
- Function calling and tool use
- Web and file search
- Computer-use tasks
With a 400K context window, it handles large datasets and long workflows without degrading performance.
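To illustrate the function-calling support mentioned above, here is a minimal sketch of a tool definition in the JSON-schema style the OpenAI API uses, plus a local dispatcher for model-issued tool calls. The `get_weather` tool, its parameters, and the stubbed result are illustrative assumptions, not details from the article.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format.
# The tool name and parameters are illustrative, not from the article.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Route a model-issued tool call to a local implementation."""
    args = json.loads(arguments_json)
    if name == "get_weather":
        # Stubbed result; a real app would query a weather service here.
        return json.dumps({"city": args["city"], "temp_c": 21})
    raise ValueError(f"Unknown tool: {name}")
```

In practice, this tool list would be passed with the request, and the dispatcher would run whenever the model responds with a tool call instead of plain text.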
Built for Real-Time AI Workflows
OpenAI designed GPT-5.4 Mini for environments where latency directly impacts user experience.
That includes:
- Coding assistants that need instant responses
- AI subagents handling background tasks
- Apps that process screenshots in real time
- Systems that rely on fast decision-making
Instead of relying on one large model, developers can now combine models. A larger model handles planning, while Mini executes tasks quickly at scale.
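The planner/executor split described above can be sketched as a simple router. The model names follow the article; the task fields and routing rule are illustrative assumptions.

```python
# Two-tier setup: a larger model plans, a smaller one executes at scale.
PLANNER_MODEL = "gpt-5.4"        # planning and decision-making
EXECUTOR_MODEL = "gpt-5.4-mini"  # fast, high-volume execution

def pick_model(task: dict) -> str:
    """Route planning-style tasks to the larger model, everything else to Mini."""
    if task.get("kind") == "plan" or task.get("needs_deep_reasoning"):
        return PLANNER_MODEL
    return EXECUTOR_MODEL
```

A coding assistant might route a "refactor this module" plan to the larger model, then fan out the resulting file edits to Mini.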
GPT-5.4 Mini Pricing
OpenAI increased pricing slightly alongside the performance improvements. GPT-5.4 Mini costs:
- $0.75 per 1M input tokens
- $4.50 per 1M output tokens
Even with higher pricing, the model can reduce overall workflow costs. In tools like Codex, it consumes only 30% of the GPT-5.4 quota, making it a cost-efficient fallback.
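A back-of-the-envelope estimate at the Mini rates above shows how token pricing translates to real spend. The request shape (2K input, 500 output tokens) is an illustrative assumption.

```python
# GPT-5.4 Mini rates from the article, in USD per 1M tokens.
MINI_INPUT_PER_M = 0.75
MINI_OUTPUT_PER_M = 4.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at GPT-5.4 Mini pricing."""
    return (input_tokens / 1_000_000) * MINI_INPUT_PER_M + \
           (output_tokens / 1_000_000) * MINI_OUTPUT_PER_M

# Example: 10,000 requests of 2K input / 500 output tokens each.
total = 10_000 * request_cost(2_000, 500)  # → $37.50
```

At that volume the whole batch costs well under $40, which is why a fast mid-tier model is attractive for production traffic.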
Users can access it:
- In ChatGPT via the “Thinking” feature
- Across API and Codex tools
- As a fallback for higher-tier models
GPT-5.4 Nano Targets High-Volume Tasks
GPT-5.4 Nano focuses on one key goal: handling large-scale workloads at the lowest possible cost. It is designed for speed and efficiency, making it a strong choice for systems that process high volumes of repetitive tasks.
Instead of trying to replace larger models, Nano works best in the background. It handles tasks that run continuously, where fast responses and low cost matter more than deep reasoning.
You can use GPT-5.4 Nano for:
- Data classification
- Information extraction
- Ranking and filtering systems
- Lightweight coding and automation
These tasks often run at scale, and even small improvements in efficiency can reduce overall costs significantly.
GPT-5.4 Nano also improves over GPT-5 Nano while keeping pricing very low:
- $0.20 per 1M input tokens
- $1.25 per 1M output tokens
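Putting the two price lists side by side makes Nano's advantage for high-volume work concrete. The workload shape (one million classification calls at 300 input / 10 output tokens each) is an illustrative assumption.

```python
# Per-1M-token prices from the article: (input rate, output rate) in USD.
PRICES = {
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def batch_cost(model: str, n_requests: int, in_toks: int, out_toks: int) -> float:
    """Total USD cost for a batch of identical requests at the given model's rates."""
    in_rate, out_rate = PRICES[model]
    return n_requests * (in_toks / 1e6 * in_rate + out_toks / 1e6 * out_rate)

# 1M short classification calls: 300 input / 10 output tokens each.
mini_cost = batch_cost("gpt-5.4-mini", 1_000_000, 300, 10)  # → $270.00
nano_cost = batch_cost("gpt-5.4-nano", 1_000_000, 300, 10)  # → $72.50
```

For this kind of repetitive, low-reasoning workload, Nano comes in at roughly a quarter of Mini's cost.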
Unlike GPT-5.4 Mini, Nano is available only through the API. This makes it ideal for backend systems such as automated pipelines, real-time processing tools, and multi-agent workflows.
In practical use, developers can rely on larger models for decision-making and use Nano to execute tasks quickly at scale. This approach improves performance while keeping costs under control.
This launch makes one thing clear: the future of AI is not just bigger, but smarter and faster.
