Staging this presentation format here in the lounge.
OpenAI Responses API streaming events — developer field guide
This reference organizes every Server-Sent Event (SSE) you may see from the Responses API when streaming. It groups events by purpose, shows typical lifecycles, and tells you which fields to read and what to do when each event arrives.
Quick notes:
- Each SSE is two lines followed by a blank line:
  - event:
  - data:
- Deltas must be appended in order by sequence_number; the final “done” event carries the completed string for that piece.
- True token usage is only present in response.completed.
- Output is organized into output items (messages, tool calls, reasoning, etc.). Most “content” streams inside those items using add/delta/done pairs.
Total event types covered: 53
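
The two-line framing from the quick notes can be sketched as a small parser. This is a minimal illustration, not an official client: `iter_sse_events` and `_field_value` are hypothetical names, and the input is assumed to be an iterable of already-decoded text lines (for example, from an HTTP library's line iterator).

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def _field_value(line: str, prefix: str) -> str:
    """Return the text after `prefix`, minus one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (event_name, payload) for each complete SSE frame."""
    event_name, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line closes the current frame.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = None, []
        elif line.startswith("event:"):
            event_name = _field_value(line, "event:")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data:"))
        # Any other line (e.g. a ":" comment) is ignored.
    # A trailing frame with no closing blank line is dropped as incomplete.


sample_lines = [
    "event: response.output_text.delta\n",
    'data: {"delta": "Hi", "sequence_number": 4}\n',
    "\n",
    "event: response.output_text.done\n",
    'data: {"text": "Hi", "sequence_number": 5}\n',
    "\n",
]
parsed = list(iter_sse_events(sample_lines))
```

Routing on the `event:` name first keeps the JSON payload opaque until a handler needs it.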
Table A. Response “envelope” lifecycle (always or almost-always seen)
These describe the overall response object status (independent of particular content items).
| event | when it appears | key fields in data | typical handling |
|---|---|---|---|
| response.queued | When the request is accepted and placed in queue | response, sequence_number | Mark request as queued; show “queued” state if desired. |
| response.created | Immediately after creation; status usually in_progress | response, sequence_number | Initialize global state from echoed parameters; stash response.id. |
| response.in_progress | Emitted while the response is being generated | response, sequence_number | Update status; note that many params are echoed repeatedly. |
| response.completed | Final envelope for successful responses | response (full final output array, usage), sequence_number | Stop streaming; read usage.input_tokens, usage.output_tokens, usage.total_tokens; persist completed output in your store. |
| response.incomplete | Final envelope when generation stops early (e.g., max_output_tokens, content_filter) | response (incomplete_details.reason), sequence_number | Stop; surface incomplete reason; downstream consumers should not expect more deltas. |
| response.failed | Final envelope on generation failure | response.error (code, message), sequence_number | Stop; surface error; retry or escalate. |
| error | Transport/stream error event (out-of-band vs. response.failed) | code, message, param, sequence_number | Treat as stream error; abort and optionally retry. |
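
As a rough sketch of the envelope handling in Table A, the function below folds envelope events into one state dict and reports when a terminal event has arrived. The name `handle_envelope` and the shape of `state` are assumptions made for illustration; only the event names and field names come from the table.

```python
ENVELOPE_EVENTS = {
    "response.queued",
    "response.created",
    "response.in_progress",
    "response.completed",
    "response.incomplete",
    "response.failed",
}
TERMINAL_EVENTS = {"response.completed", "response.incomplete", "response.failed"}


def handle_envelope(state: dict, event: str, data: dict) -> bool:
    """Update `state` in place; return True once the stream should stop."""
    if event == "error":
        # Out-of-band stream error: fields sit at the top level of the payload.
        state["status"] = "stream_error"
        state["error"] = {"code": data.get("code"), "message": data.get("message")}
        return True
    if event not in ENVELOPE_EVENTS:
        return False  # Not an envelope event; leave it to other handlers.

    response = data.get("response") or {}
    state["id"] = response.get("id", state.get("id"))
    state["status"] = event.split(".", 1)[1]  # e.g. "created", "completed"

    if event == "response.completed":
        state["usage"] = response.get("usage")
    elif event == "response.incomplete":
        details = response.get("incomplete_details") or {}
        state["reason"] = details.get("reason")
    elif event == "response.failed":
        state["error"] = response.get("error")
    return event in TERMINAL_EVENTS
```

A caller would typically break out of its read loop as soon as this returns True.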
Table B. Output item assembly (message/refusal text streaming, content parts)
These events add and finalize pieces inside the response.output array.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A new output item (e.g., assistant message) is added | output_index, item (id, type, role/status), sequence_number | Create per-item buffers keyed by item.id and output_index. |
| response.content_part.added | A new content part is added to an item (e.g., an output_text part) | item_id, output_index, content_index, part, sequence_number | Initialize buffers for this content part (often an empty string). |
| response.output_text.delta | Next chunk of assistant text tokens | item_id, output_index, content_index, delta, logprobs?, sequence_number | Append delta to your text buffer; optionally record logprobs. |
| response.output_text.done | Final full text for the content part | item_id, output_index, content_index, text, logprobs?, sequence_number | Replace/confirm buffer with finalized text; emit to UI. |
| response.output_text.annotation.added | Annotation added to output text | item_id, output_index, content_index, annotation_index, annotation, sequence_number | Attach annotation metadata (citations, file paths, etc.) to the text segment it references. |
| response.refusal.delta | Partial refusal string (instead of normal text) | item_id, output_index, content_index, delta, sequence_number | Append refusal text to refusal buffer for the same content part. |
| response.refusal.done | Final refusal string | item_id, output_index, content_index, refusal, sequence_number | Finalize refusal; present as refusal message. |
| response.content_part.done | The content part is complete | item_id, output_index, content_index, part, sequence_number | Mark the part closed; no more deltas for this part. |
| response.output_item.done | The output item is complete (e.g., message.status=completed) | output_index, item, sequence_number | Mark the entire item closed; safe to render or persist it as final. |
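
The add/delta/done pattern in Table B might be handled with per-part buffers like the following. `TextAssembler` is a hypothetical helper, keyed by (item_id, content_index) as the table suggests; it keeps the incremental buffer for live rendering and treats the done payload as the final value.

```python
from collections import defaultdict


class TextAssembler:
    """Accumulate output_text / refusal deltas per content part."""

    def __init__(self) -> None:
        self.buffers = defaultdict(str)  # (item_id, content_index) -> partial text
        self.final = {}                  # (item_id, content_index) -> finished text

    def feed(self, event: str, data: dict) -> None:
        key = (data.get("item_id"), data.get("content_index"))
        if event in ("response.output_text.delta", "response.refusal.delta"):
            self.buffers[key] += data["delta"]
        elif event == "response.output_text.done":
            # The done event carries the authoritative full string.
            self.final[key] = data["text"]
        elif event == "response.refusal.done":
            self.final[key] = data["refusal"]

    def current(self, item_id: str, content_index: int) -> str:
        """Best available text: final if known, else the running buffer."""
        key = (item_id, content_index)
        return self.final.get(key, self.buffers.get(key, ""))
```

Reading through `current` lets a UI render from deltas and silently switch to the reconciled text once the part is done.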
Table C. Reasoning summary streaming (reasoning models)
Some models stream a short, shareable “reasoning summary” separate from private chain-of-thought.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A reasoning item is created (type=reasoning) | output_index, item (id, type=reasoning), sequence_number | Initialize reasoning-summary buffers for this item. |
| response.reasoning_summary_part.added | A new reasoning summary “part” starts | item_id, output_index, summary_index, part, sequence_number | Initialize per-part buffer (usually empty text). |
| response.reasoning_summary_text.delta | Next text chunk for the current summary part | item_id, output_index, summary_index, delta, sequence_number | Append delta to the part buffer. |
| response.reasoning_summary_text.done | Final text for the current summary part | item_id, output_index, summary_index, text, sequence_number | Finalize the part buffer. |
| response.reasoning_summary_part.done | Marks the part complete | item_id, output_index, summary_index, part, sequence_number | Close the part; no more deltas for this part. |
| response.output_item.done | The reasoning item is complete | output_index, item, sequence_number | Finalize reasoning preview content for display/logging. |
Table D. Reasoning text streaming (GPT-OSS models)
Reasoning textual content (not the short summary) can also stream in parts.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.content_part.added | A reasoning_text content part is added | item_id, output_index, content_index, part, sequence_number | Initialize a buffer for this reasoning content. |
| response.reasoning_text.delta | Streaming reasoning text | item_id, output_index, content_index, delta, sequence_number | Append delta to reasoning-text buffer (usually not shown to end-users). |
| response.reasoning_text.done | Final full reasoning text | item_id, output_index, content_index, text, sequence_number | Finalize buffer; store if your app uses it. |
| response.content_part.done | The content part is complete | item_id, output_index, content_index, part, sequence_number | Close the part. |
Table E. Function calling (your “function” tools)
These events stream JSON arguments for a function_call tool item.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A function_call item is created | output_index, item (id, type=function_call, call_id, name), sequence_number | Track tool call session by item_id/call_id. |
| response.function_call_arguments.delta | Streaming JSON arguments for the call | item_id, output_index, delta, sequence_number | Append raw JSON to an args buffer (string); don’t parse until done. |
| response.function_call_arguments.done | Final arguments are available | item_id, output_index, name, arguments, sequence_number | Parse JSON; invoke your function with typed args. |
| response.output_item.done | The function_call item is complete | output_index, item, sequence_number | Consider the call “issued”; await your app’s tool output. |
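
One way to follow the "don't parse until done" advice in Table E is sketched below. `FunctionCallCollector` and its `registry` argument (a plain dict mapping function names to Python callables) are hypothetical; the field names are the ones listed in the table.

```python
import json
from typing import Callable, Dict


class FunctionCallCollector:
    """Buffer streamed arguments and invoke the function once they are final."""

    def __init__(self, registry: Dict[str, Callable]) -> None:
        self.registry = registry
        self.calls = {}    # item_id -> (name, call_id)
        self.args = {}     # item_id -> raw JSON string received so far
        self.results = {}  # call_id -> return value of the local function

    def feed(self, event: str, data: dict) -> None:
        if event == "response.output_item.added":
            item = data["item"]
            if item.get("type") == "function_call":
                self.calls[item["id"]] = (item["name"], item["call_id"])
                self.args[item["id"]] = ""
        elif event == "response.function_call_arguments.delta":
            item_id = data["item_id"]
            # Keep the raw string; partial JSON is not parseable yet.
            self.args[item_id] = self.args.get(item_id, "") + data["delta"]
        elif event == "response.function_call_arguments.done":
            name, call_id = self.calls[data["item_id"]]
            kwargs = json.loads(data["arguments"])  # authoritative final string
            self.results[call_id] = self.registry[name](**kwargs)
```

Results are keyed by call_id here on the assumption that the app will later send them back as tool outputs in a follow-up request.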
Table F. Custom tool calling (your “custom” tools)
When the model calls a custom tool (not “function”), it streams the tool’s input.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A custom_tool_call item is created | output_index, item (id, type=custom_tool_call, call_id, name), sequence_number | Begin buffering the tool’s input. |
| response.custom_tool_call_input.delta | Streaming input for the tool | item_id, output_index, delta, sequence_number | Append raw input; wait for completion to parse/consume. |
| response.custom_tool_call_input.done | Final complete input | item_id, output_index, input, sequence_number | Consume/dispatch to your tool implementation. |
| response.output_item.done | The custom tool call item is complete | output_index, item, sequence_number | Transition state to “awaiting tool output”. |
Table G. MCP tool calls (remote servers/connectors)
Covers both argument streaming to an MCP call and its status, plus listing server tools.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | An mcp_call item is created | output_index, item (id, type=mcp_call, server_label, name, approval_request_id?), sequence_number | Track MCP call session. |
| response.mcp_call_arguments.delta | Streaming MCP arguments (JSON string) | item_id, output_index, delta, sequence_number | Append to args buffer. |
| response.mcp_call_arguments.done | Final MCP arguments | item_id, output_index, arguments, sequence_number | Parse for display or logging; the API, not your app, carries out the call on the MCP server. |
| response.mcp_call.in_progress | Call has started | item_id, output_index, sequence_number | Reflect “calling” status in UI/logs. |
| response.mcp_call.completed | Call finished successfully | item_id, output_index, sequence_number | Collect any outputs contained in the item. |
| response.mcp_call.failed | Call failed | item_id, output_index, sequence_number | Surface failure; consider fallback. |
| response.output_item.done | The mcp_call item is complete | output_index, item, sequence_number | Finalize item. |
| response.mcp_list_tools.in_progress | Listing tools on an MCP server | item_id, output_index, sequence_number | Show loading state for tool discovery. |
| response.mcp_list_tools.completed | Tools listing succeeded | item_id, output_index, sequence_number | Read tools list from the item; cache it. |
| response.mcp_list_tools.failed | Tools listing failed | item_id, output_index, sequence_number | Inform the user; retry or reconfigure. |
Table H. Built-in File Search tool calls
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A file_search_call item is created | output_index, item (id, type=file_search_call, status), sequence_number | Track search session. |
| response.file_search_call.in_progress | Search call started | output_index, item_id, sequence_number | Show “preparing search”. |
| response.file_search_call.searching | Actively searching | output_index, item_id, sequence_number | Show “searching…”. |
| response.file_search_call.completed | Search finished | output_index, item_id, sequence_number | Read results array from the item (file_id, text, score, etc.). |
| response.output_item.done | The file_search_call item is complete | output_index, item, sequence_number | Finalize item. |
Table I. Built-in Web Search tool calls
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A web_search_call item is created | output_index, item (id, type=web_search_call, action.search/open_page/find), sequence_number | Start tracking search session and actions. |
| response.web_search_call.in_progress | Web search started | output_index, item_id, sequence_number | Show “preparing web search”. |
| response.web_search_call.searching | Searching/visiting pages | output_index, item_id, sequence_number | Update UI; actions embedded in item.action. |
| response.web_search_call.completed | Search call finished | output_index, item_id, sequence_number | Read final status/results from item; cite sources. |
| response.output_item.done | The web_search_call item is complete | output_index, item, sequence_number | Finalize item. |
Table J. Code Interpreter tool calls
Two classes of events: overall call status, and streaming of the code snippet (when present).
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | A code_interpreter_call item is created | output_index, item (id, status, container_id), sequence_number | Track code run session and container. |
| response.code_interpreter_call.in_progress | Call accepted/starting | output_index, item_id, sequence_number | Show “starting interpreter…”. |
| response.code_interpreter_call.interpreting | Interpreter is running | output_index, item_id, sequence_number | Show “running code…”. |
| response.code_interpreter_call_code.delta | Streaming the code snippet | output_index, item_id, delta, sequence_number | Append to the code text buffer; useful for preview/logging. |
| response.code_interpreter_call_code.done | Final code snippet | output_index, item_id, code, sequence_number | Finalize code buffer displayed/logged. |
| response.code_interpreter_call.completed | Call finished | output_index, item_id, sequence_number | Read outputs (logs/images) from the item. |
| response.output_item.done | The code_interpreter_call item is complete | output_index, item, sequence_number | Finalize item. |
Table K. Image Generation tool calls
Supports partial preview images, then a final image result.
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.output_item.added | An image_generation_call item is created | output_index, item (id, status), sequence_number | Track image session. |
| response.image_generation_call.in_progress | Call accepted | output_index, item_id, sequence_number | Show “starting image job…”. |
| response.image_generation_call.generating | Actively generating | output_index, item_id, sequence_number | Update progress indicator. |
| response.image_generation_call.partial_image | Partial image chunk (base64) | output_index, item_id, partial_image_index, partial_image_b64, sequence_number | Render progressive preview (data:image/png;base64, …). |
| response.image_generation_call.completed | Generation finished | output_index, item_id, sequence_number | Read final base64 image from item.result (or item outputs). |
| response.output_item.done | The image_generation_call item is complete | output_index, item, sequence_number | Finalize item and thumbnails/gallery. |
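
A progressive preview along the lines of Table K could look like this. It is only a sketch under two assumptions that the table does not spell out: each partial_image_b64 is a complete, standalone PNG frame, and a higher partial_image_index supersedes a lower one. `ImagePreview` and `to_data_url` are hypothetical names.

```python
import base64
from typing import Optional


def to_data_url(b64_png: str) -> str:
    """Wrap base64 PNG data in a data: URL suitable for an <img> src."""
    return "data:image/png;base64," + b64_png


class ImagePreview:
    """Track the newest partial image per image_generation_call item."""

    def __init__(self) -> None:
        self.latest = {}  # item_id -> (partial_image_index, decoded bytes)

    def feed(self, event: str, data: dict) -> None:
        if event != "response.image_generation_call.partial_image":
            return
        index = data["partial_image_index"]
        previous = self.latest.get(data["item_id"])
        # Assumption: a higher index is a more refined frame; ignore stale ones.
        if previous is None or index > previous[0]:
            frame = base64.b64decode(data["partial_image_b64"])
            self.latest[data["item_id"]] = (index, frame)

    def frame_bytes(self, item_id: str) -> Optional[bytes]:
        entry = self.latest.get(item_id)
        return entry[1] if entry else None
```

The final image should still be read from the completed item, as the table notes; the preview is disposable.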
Table L. Audio output and transcript (text-to-speech style streaming, unreleased)
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.audio.delta | Partial audio bytes | delta (base64 bytes), sequence_number | Append to audio stream buffer; play progressively if supported. |
| response.audio.done | Audio stream complete | sequence_number | Close audio buffer; flush to player/file. |
| response.audio.transcript.delta | Partial transcript text | delta, sequence_number | Append to transcript buffer. |
| response.audio.transcript.done | Transcript complete | sequence_number | Finalize transcript. |
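
For the audio events in Table L, a buffer-and-finalize sketch follows. Since the table marks these events as unreleased, treat everything here as tentative: `AudioCollector` is a hypothetical name, and the code assumes each audio delta is a base64 chunk whose decoded bytes can simply be concatenated.

```python
import base64


class AudioCollector:
    """Collect streamed audio bytes and transcript text."""

    def __init__(self) -> None:
        self.audio = bytearray()
        self.transcript_parts = []
        self.audio_done = False
        self.transcript_done = False

    def feed(self, event: str, data: dict) -> None:
        if event == "response.audio.delta":
            # Assumption: chunks decode independently and concatenate in order.
            self.audio.extend(base64.b64decode(data["delta"]))
        elif event == "response.audio.done":
            self.audio_done = True
        elif event == "response.audio.transcript.delta":
            self.transcript_parts.append(data["delta"])
        elif event == "response.audio.transcript.done":
            self.transcript_done = True

    @property
    def transcript(self) -> str:
        # The done event lists no text field, so the deltas are the transcript.
        return "".join(self.transcript_parts)
```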
Table M. Miscellaneous content and refinement
| event | when it appears | key fields | typical handling |
|---|---|---|---|
| response.content_part.added | New content part for any item | item_id, output_index, content_index, part | Start a per-part buffer (text, reasoning_text, etc.). |
| response.content_part.done | That content part is finalized | item_id, output_index, content_index, part | Close the part; no more deltas for it. |
Lifecycle “recipes” (ordered sequences you will commonly see)
- Plain assistant text response
  - response.created → response.in_progress
  - response.output_item.added (message) → response.content_part.added (output_text)
  - Many response.output_text.delta → response.output_text.done
  - response.content_part.done → response.output_item.done (message)
  - response.completed (with usage)
- Reasoning summary + assistant text (reasoning models)
  - response.created → response.in_progress
  - response.output_item.added (reasoning)
  - response.reasoning_summary_part.added → many reasoning_summary_text.delta → reasoning_summary_text.done → reasoning_summary_part.done → response.output_item.done (reasoning)
  - response.output_item.added (message) → response.content_part.added (output_text)
  - Many output_text.delta → output_text.done → content_part.done → output_item.done (message)
  - response.completed (usage present)
- Function tool call
  - response.output_item.added (function_call)
  - function_call_arguments.delta × N → function_call_arguments.done (parse JSON)
  - output_item.done (function_call), then your app runs the function and later posts tool outputs in a subsequent request turn
- Custom tool call
  - response.output_item.added (custom_tool_call)
  - custom_tool_call_input.delta × N → custom_tool_call_input.done (consume input)
  - output_item.done
- MCP tool call
  - response.output_item.added (mcp_call)
  - mcp_call_arguments.delta × N → mcp_call_arguments.done
  - mcp_call.in_progress → (server work) → mcp_call.completed or mcp_call.failed
  - output_item.done
- File search
  - response.output_item.added (file_search_call)
  - file_search_call.in_progress → file_search_call.searching → file_search_call.completed
  - output_item.done
- Web search
  - response.output_item.added (web_search_call)
  - web_search_call.in_progress → web_search_call.searching → web_search_call.completed
  - output_item.done
- Code interpreter
  - response.output_item.added (code_interpreter_call)
  - code_interpreter_call.in_progress → code_interpreter_call.interpreting
  - code_interpreter_call_code.delta × N → code_interpreter_call_code.done
  - code_interpreter_call.completed → output_item.done
- Image generation
  - response.output_item.added (image_generation_call)
  - image_generation_call.in_progress → image_generation_call.generating
  - image_generation_call.partial_image × 0..3
  - image_generation_call.completed → output_item.done
- Text refusal
  - response.output_item.added (message) → content_part.added (refusal or output_text with refusal)
  - refusal.delta × N → refusal.done → content_part.done → output_item.done
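
The status-only steps in the tool recipes (in_progress, searching, interpreting, generating, completed, failed) share one naming shape, so a single tracker can cover them. The sketch below is one possible approach; `STATUS_EVENT` and `track_status` are hypothetical names, and the regular expression only encodes the event names listed in this guide.

```python
import re
from typing import Dict, Optional, Tuple

# Matches response.<tool>.<status> for the status-style events in this guide.
STATUS_EVENT = re.compile(
    r"^response\."
    r"(mcp_call|mcp_list_tools|file_search_call|web_search_call|"
    r"code_interpreter_call|image_generation_call)\."
    r"(in_progress|searching|interpreting|generating|completed|failed)$"
)


def track_status(
    statuses: Dict[str, Tuple[str, str]], event: str, data: dict
) -> Optional[Tuple[str, str]]:
    """Record (tool, status) for the event's item_id; return it, or None."""
    match = STATUS_EVENT.match(event)
    if match is None:
        return None  # Delta, done, and partial_image events are handled elsewhere.
    tool, status = match.groups()
    statuses[data["item_id"]] = (tool, status)
    return tool, status
```

A UI layer could map the returned status to a label such as "searching…" without knowing which tool produced it.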
Practical handler checklist
- Always parse and route by the “event:” line; then json.loads the “data:” blob.
- Maintain maps:
  - response.id → global response state
  - (output_index, item_id) → per-item state
  - (item_id, content_index) → per-content buffer
  - (item_id, summary_index) → per-reasoning-summary buffer
- Append .delta text in order by sequence_number; only trust the .done text as ground truth to finalize a part.
- Only read usage from response.completed. Do not sum deltas to infer tokens.
- If you see response.incomplete or response.failed, stop assembling; present the reason or error.
- Some payloads may include optional fields (e.g., logprobs, annotations, obfuscation); ignore unknown keys safely.
- It is safe to render incrementally from deltas, but reconcile with the .done text when it arrives.
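
The "route by event, then decode data" advice in the checklist might be wired up with a small dispatch table. `EventRouter` is a hypothetical helper, not part of any SDK; unknown event names are recorded rather than raised, in keeping with the forward-compatibility point above.

```python
import json
from typing import Any, Callable, Dict, List


class EventRouter:
    """Dispatch SSE frames to handlers registered by event name."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[dict], Any]] = {}
        self.unhandled: List[str] = []

    def on(self, event_name: str) -> Callable:
        """Decorator that registers a handler for one event name."""
        def register(fn: Callable[[dict], Any]) -> Callable[[dict], Any]:
            self.handlers[event_name] = fn
            return fn
        return register

    def dispatch(self, event_name: str, raw_data: str) -> Any:
        data = json.loads(raw_data)
        handler = self.handlers.get(event_name)
        if handler is None:
            # Tolerate events this client does not know about yet.
            self.unhandled.append(event_name)
            return None
        return handler(data)


router = EventRouter()
chunks: List[str] = []


@router.on("response.output_text.delta")
def _on_text_delta(data: dict) -> None:
    chunks.append(data["delta"])


router.dispatch("response.output_text.delta", '{"delta": "Hi", "sequence_number": 1}')
router.dispatch("response.some_future_event", '{"sequence_number": 2}')
```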
Quick index by category
- Envelope: response.queued, response.created, response.in_progress, response.completed, response.incomplete, response.failed, error
- Output item/text: response.output_item.added, response.content_part.added, response.output_text.delta, response.output_text.done, response.output_text.annotation.added, response.content_part.done, response.output_item.done, response.refusal.delta, response.refusal.done
- Reasoning (summary): response.reasoning_summary_part.added, response.reasoning_summary_text.delta, response.reasoning_summary_text.done, response.reasoning_summary_part.done
- Reasoning (text): response.reasoning_text.delta, response.reasoning_text.done
- Function tools: response.function_call_arguments.delta, response.function_call_arguments.done
- Custom tools: response.custom_tool_call_input.delta, response.custom_tool_call_input.done
- MCP tools: response.mcp_call_arguments.delta, response.mcp_call_arguments.done, response.mcp_call.in_progress, response.mcp_call.completed, response.mcp_call.failed, response.mcp_list_tools.in_progress, response.mcp_list_tools.completed, response.mcp_list_tools.failed
- File search: response.file_search_call.in_progress, response.file_search_call.searching, response.file_search_call.completed
- Web search: response.web_search_call.in_progress, response.web_search_call.searching, response.web_search_call.completed
- Code interpreter: response.code_interpreter_call.in_progress, response.code_interpreter_call.interpreting, response.code_interpreter_call_code.delta, response.code_interpreter_call_code.done, response.code_interpreter_call.completed
- Image generation: response.image_generation_call.in_progress, response.image_generation_call.generating, response.image_generation_call.partial_image, response.image_generation_call.completed
- Audio: response.audio.delta, response.audio.done, response.audio.transcript.delta, response.audio.transcript.done
Appendix: Per-event quick reference (flat table)
This condensed table lists every event type once for easy scanning.
| event | purpose | key fields | do this |
|---|---|---|---|
| response.queued | Request queued | response | Mark queued. |
| response.created | Response object created | response | Init state, capture response.id. |
| response.in_progress | Work ongoing | response | Keep-alive/update status. |
| response.completed | Work finished OK | response (final output, usage) | Stop; read usage; persist. |
| response.incomplete | Ended early | response.incomplete_details | Stop; show reason. |
| response.failed | Failed | response.error | Stop; show error/retry. |
| error | Stream error | code, message, param | Abort; retry or report. |
| response.output_item.added | New item in output | output_index, item | Make per-item buffers. |
| response.content_part.added | New content part | item_id, output_index, content_index, part | Make per-part buffer. |
| response.output_text.delta | Partial assistant text | item_id, output_index, content_index, delta, logprobs? | Append text. |
| response.output_text.done | Final assistant text | item_id, output_index, content_index, text | Finalize text. |
| response.output_text.annotation.added | Annotation to text | item_id, output_index, content_index, annotation_index, annotation | Attach annotation. |
| response.refusal.delta | Partial refusal | item_id, output_index, content_index, delta | Append refusal text. |
| response.refusal.done | Final refusal | item_id, output_index, content_index, refusal | Finalize refusal. |
| response.content_part.done | Content part finished | item_id, output_index, content_index, part | Close part. |
| response.output_item.done | Output item finished | output_index, item | Close item. |
| response.reasoning_summary_part.added | New reasoning summary part | item_id, output_index, summary_index, part | Make summary-part buffer. |
| response.reasoning_summary_text.delta | Reasoning summary delta | item_id, output_index, summary_index, delta | Append. |
| response.reasoning_summary_text.done | Reasoning summary final | item_id, output_index, summary_index, text | Finalize. |
| response.reasoning_summary_part.done | Summary part done | item_id, output_index, summary_index, part | Close part. |
| response.reasoning_text.delta | Reasoning text delta | item_id, output_index, content_index, delta | Append. |
| response.reasoning_text.done | Reasoning text final | item_id, output_index, content_index, text | Finalize. |
| response.function_call_arguments.delta | Function args delta | item_id, output_index, delta | Append JSON string. |
| response.function_call_arguments.done | Function args final | item_id, output_index, name, arguments | Parse and invoke. |
| response.custom_tool_call_input.delta | Custom tool input delta | item_id, output_index, delta | Append input. |
| response.custom_tool_call_input.done | Custom tool input final | item_id, output_index, input | Consume input. |
| response.mcp_call_arguments.delta | MCP args delta | item_id, output_index, delta | Append JSON string. |
| response.mcp_call_arguments.done | MCP args final | item_id, output_index, arguments | Parse for display/logging. |
| response.mcp_call.in_progress | MCP call started | item_id, output_index | Update status. |
| response.mcp_call.completed | MCP call finished | item_id, output_index | Read outputs. |
| response.mcp_call.failed | MCP call failed | item_id, output_index | Surface error. |
| response.mcp_list_tools.in_progress | MCP tool listing started | item_id, output_index | Update status. |
| response.mcp_list_tools.completed | MCP tool listing finished | item_id, output_index | Consume tools list. |
| response.mcp_list_tools.failed | MCP tool listing failed | item_id, output_index | Surface error. |
| response.file_search_call.in_progress | File search started | output_index, item_id | Update status. |
| response.file_search_call.searching | File search running | output_index, item_id | Update status. |
| response.file_search_call.completed | File search finished | output_index, item_id | Read results from item. |
| response.web_search_call.in_progress | Web search started | output_index, item_id | Update status. |
| response.web_search_call.searching | Web search running | output_index, item_id | Update status. |
| response.web_search_call.completed | Web search finished | output_index, item_id | Consume results/action outcome. |
| response.code_interpreter_call.in_progress | Code call started | output_index, item_id | Show “starting”. |
| response.code_interpreter_call.interpreting | Code running | output_index, item_id | Show “running…”. |
| response.code_interpreter_call_code.delta | Code snippet delta | output_index, item_id, delta | Append code preview. |
| response.code_interpreter_call_code.done | Code snippet final | output_index, item_id, code | Finalize code text. |
| response.code_interpreter_call.completed | Code call finished | output_index, item_id | Read outputs (logs/images). |
| response.image_generation_call.in_progress | Image call started | output_index, item_id | Show “starting…”. |
| response.image_generation_call.generating | Image generating | output_index, item_id | Update progress. |
| response.image_generation_call.partial_image | Partial image | output_index, item_id, partial_image_index, partial_image_b64 | Render preview. |
| response.image_generation_call.completed | Image done | output_index, item_id | Show final image. |
| response.audio.delta | Partial audio bytes | delta (base64), sequence_number | Append/play audio. |
| response.audio.done | Audio finished | sequence_number | Close audio stream. |
| response.audio.transcript.delta | Partial transcript | delta, sequence_number | Append transcript. |
| response.audio.transcript.done | Transcript finished | sequence_number | Finalize transcript. |
Implementation tips and invariants
- sequence_number is monotonically increasing per stream; use it to enforce ordering and to ignore duplicates.
- item_id is unique per output item; output_index is its position in response.output.
- content_index applies within an item’s content array; summary_index applies within a reasoning summary list.
- Deltas are additive; “done” carries the authoritative final string.
- You may stop reading after response.completed/incomplete/failed, but many clients drain the socket to EOF politely.
- Only response.completed includes usage; prior echoes of response contain usage: null.
- Unknown keys (e.g., obfuscation, experimental metadata) should be ignored to preserve forward compatibility.
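
The first invariant above can be enforced with a very small guard. This sketch relies only on sequence_number increasing within a stream; it does not assume the numbers are contiguous or start at any particular value. `SequenceGuard` is a hypothetical name.

```python
from typing import Optional


class SequenceGuard:
    """Accept each event once, rejecting duplicates and stale replays."""

    def __init__(self) -> None:
        self.last_seen: Optional[int] = None

    def accept(self, data: dict) -> bool:
        """Return True if the event is new and should be processed."""
        seq = data["sequence_number"]
        if self.last_seen is not None and seq <= self.last_seen:
            return False  # Duplicate or out-of-order event; skip it.
        self.last_seen = seq
        return True
```

Calling `accept` before any buffer is touched keeps a replayed delta from being appended twice, for example after a reconnect that re-sends earlier events.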
This is an AI product based on:
| input tokens: 155032 | output tokens: 8674 |
|---|---|
| uncached: 155032 | non-reasoning: 6050 |
| cached: 0 | reasoning: 2624 |
Still to come, and omitted here for length: the dozens of item types that can appear in the “instructions” array and then in “output items”, echoed back at you multiple times in different events. Using “prompts” is supposed to be efficient, only for those echoes to radically amplify your stream anyway, and all of it is yours to parse and persist into new inputs, as tolerated by your “ID Verified” status. You even get to see a single “reasoning_summary” at least three times in the stream…