OpenAI chat completion stream response delta contains roles other than `assistant`

Hi,

We recently found out that OpenAI’s OpenAPI specification states that the role property of the ChatCompletionStreamResponseDelta object can be, e.g., system, user, or tool in addition to assistant. This seems to be wrong; also, in the non-streaming case the role is limited to assistant.

(Since I cannot put links in my post, please prepend the following strings with https://github.com/)

openai/openai-openapi/blob/e0cb2d721753e13e69e918465795d6e9f87ab15a/openapi.yaml#L11494

openai/openai-openapi/blob/e0cb2d721753e13e69e918465795d6e9f87ab15a/openapi.yaml#L11378

Also, it seems like the official OpenAI SDK for Node assumes this role to be assistant?

openai/openai-node/blob/bc9f15fc7d1f4acf625adc3603577b06d59cdc5c/src/lib/ChatCompletionStream.ts#L623
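
For context, this is roughly how we consume the stream with the official Node SDK; the model name below is just an example. In our observations the role only ever appears on the first chunk, and it is always assistant:

    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    async function main() {
      const stream = await client.chat.completions.create({
        model: "gpt-4o-mini", // example model; any chat model behaves the same
        messages: [{ role: "user", content: "Hello" }],
        stream: true,
      });

      for await (const chunk of stream) {
        const delta = chunk.choices[0]?.delta;
        // The role only shows up on the first chunk, and in every case we
        // have observed it is "assistant"; later chunks omit it entirely.
        if (delta?.role) console.log("role:", delta.role);
      }
    }

    main();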

Could you explain why other roles can be possible in a response chunk? Or confirm that this is indeed a bug? Thanks!

Regards,
Zhongpin


Thanks for raising this — I’ve been digging into the same issue recently and had similar questions.

From what I’ve seen, and based on the OpenAPI spec (openai/openai-openapi/blob/e0cb2d7217…), the role in ChatCompletionChunk responses is typically expected to be "assistant". However, that assumption doesn’t fully reflect real-world behavior, especially when function calling or tool integrations are involved.

For instance:

  • When using the function calling capability, the assistant might stream a tool call initiation (role: "assistant"), followed by simulated tool responses (role: "tool"), and then continue generating output based on those responses.
  • This means the stream can legitimately include chunks with "tool" (and potentially other roles) depending on the architecture of the conversation; the sketch after this list shows how those tool messages flow through a round trip.
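
To make the tool flow concrete, here is a rough sketch of such a round trip with the Node SDK; the model name, tool name, call id, and values are all made up for illustration. Note how the tool result re-enters the conversation as a role: "tool" message:

    import OpenAI from "openai";

    const client = new OpenAI();

    async function followUp() {
      const stream = await client.chat.completions.create({
        model: "gpt-4o-mini", // example model
        stream: true,
        messages: [
          { role: "user", content: "What's the weather in Berlin?" },
          {
            role: "assistant",
            // In real code this id is echoed from the streamed tool call.
            tool_calls: [
              {
                id: "call_abc123",
                type: "function",
                function: { name: "get_weather", arguments: '{"city":"Berlin"}' },
              },
            ],
          },
          // The tool's output goes back in as an input message with role
          // "tool", tied to the original call by tool_call_id.
          { role: "tool", tool_call_id: "call_abc123", content: '{"temp_c": 21}' },
        ],
      });

      for await (const chunk of stream) {
        process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
      }
    }

    followUp();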

Looking at the Node SDK (openai/openai-node/blob/bc9f15fc7d…), it seems to assume "assistant" as the default role for streamed chunks, which might simplify handling but doesn’t account for multi-role outputs. That might be fine for basic use cases, but developers building more advanced pipelines will want to explicitly handle the possibility of multiple roles during streaming.

So to your question — I don’t think it’s a bug, but rather a gap between evolving API capabilities and what’s currently assumed or documented in the SDK/spec.

A more robust implementation would benefit from:

  • A clearer indication in the OpenAPI spec that role in a chunked response is not guaranteed to always be "assistant".
  • SDK improvements to gracefully handle multi-role chunks (a defensive sketch follows this list).
  • Developer awareness that role diversity during streaming is a valid (and growing) behavior, especially with tool usage.
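
For example, a minimal defensive sketch (the helper name and the warning behavior are hypothetical, not something the SDK does today):

    import type { ChatCompletionChunk } from "openai/resources/chat/completions";

    // Record the role per choice instead of hard-coding "assistant", so a
    // wider role union in the spec would not silently break consumers.
    const rolesByChoice = new Map<number, string>();

    function handleChunk(chunk: ChatCompletionChunk) {
      for (const choice of chunk.choices) {
        const role = choice.delta.role;
        if (role) {
          rolesByChoice.set(choice.index, role);
          if (role !== "assistant") {
            // Never observed in practice, but the spec technically allows it.
            console.warn(`unexpected delta role: ${role}`);
          }
        }
      }
    }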

Would be great to hear from someone at OpenAI if there’s an update on plans for this in future SDK/spec revisions.


You are correct. The AI can only produce output within its own “assistant” role (the role set up by its hidden writing prompt).

system, user, and tool are context inputs for the AI to complete upon.

In ChatML, the token that closes an assistant response message terminates the output, acting as a stop sequence. The AI cannot proceed to write in another role, and there is no backend “completion role catcher” that would split further role messages out and send them to you, even if the stop sequence were somehow avoided or OpenAI allowed disabling the built-in special stop tokens for a non-terminating completion product. (Although OpenAI has had quirky issues where a misbehaving model failed to terminate its output and jumped to restart itself.)
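
To illustrate with the ChatML container format as publicly documented (the exact template varies by model):

    <|im_start|>system
    You are a helpful assistant.<|im_end|>
    <|im_start|>user
    Hello<|im_end|>
    <|im_start|>assistant

The AI only generates after that final assistant header, and the first <|im_end|> it emits acts as the stop sequence, so it never gets the chance to open a message in another role.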

With the Responses API endpoint, OpenAI does start to parse out and repackage individual items from the AI’s production, but that is a different specification.


In the specification, if the $def for this type is not reused as an input-context validator, then the other roles do indeed seem redundant or invalid. Allowing more roles merely validates something impossible to receive, so it is inconsequential. The backend has to send you “assistant” itself, since the AI does not produce a role to start a message.
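
You can see that in the very first server-sent event of any stream, roughly like this (abridged; the id, timestamp, and model are placeholders):

    data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1700000000,"model":"gpt-4o-mini","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}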
