/v1/responses). It brings GPT-5.4 capabilities to a faster, lower-cost model for high-volume workloads.
Key capabilities
- Responses API — Uses the newer
/v1/responsesendpoint withinputinstead ofmessages - Reasoning control — Configure reasoning effort: none (default), low, medium, high, or xhigh
- Coding and agents — Optimized for coding, computer use, and subagent workloads
- Long context — Supports a 400K token context window and up to 128K output tokens
- Multimodal input — Accepts text and image input, with text output
- Tool use — Supports function calling and Responses API tools such as web search, file search, code interpreter, and computer use
- Streaming — Supports real-time token streaming via SSE
Quick example
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
model | string | Yes | Must be gpt-5.4-mini |
input | array | Yes | List of { role, content } objects |
stream | boolean | No | Enable SSE streaming. Default: false |
top_p | float | No | Nucleus sampling threshold. Default: 1 |
max_output_tokens | integer | No | Maximum output tokens to generate |
reasoning | object | No | { effort, summary } — controls reasoning depth. effort supports none, low, medium, high, and xhigh |
text | object | No | { format, verbosity } — controls output format and verbosity |
tools | array | No | List of tools the model may call |
store | boolean | No | Store response for later retrieval. Default: true |
API Reference
View the interactive API playground for GPT-5.4 Mini.