Completion finish_reason is missing when stream=true

When the stream option is enabled, the completions response is missing a finish_reason on the last event.

Here’s an example response from the playground:

data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": "Hello?\n", "index": 0, "logprobs": {"tokens": ["Hello", "?", "\n"], "token_logprobs": [null, -5.6045575, -0.15610844], "top_logprobs": null, "text_offset": [0, 5, 6]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": "\n", "index": 0, "logprobs": {"tokens": ["\n"], "token_logprobs": [-0.0003744577], "top_logprobs": null, "text_offset": [7]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": "Hi", "index": 0, "logprobs": {"tokens": ["Hi"], "token_logprobs": [-0.23138906], "top_logprobs": null, "text_offset": [8]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": " there", "index": 0, "logprobs": {"tokens": [" there"], "token_logprobs": [-0.67385423], "top_logprobs": null, "text_offset": [10]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": "!", "index": 0, "logprobs": {"tokens": ["!"], "token_logprobs": [-0.36435923], "top_logprobs": null, "text_offset": [16]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": " How", "index": 0, "logprobs": {"tokens": [" How"], "token_logprobs": [-0.13356256], "top_logprobs": null, "text_offset": [17]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": " can", "index": 0, "logprobs": {"tokens": [" can"], "token_logprobs": [-0.0740416], "top_logprobs": null, "text_offset": [21]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": " I", "index": 0, "logprobs": {"tokens": [" I"], "token_logprobs": [-0.000176637], "top_logprobs": null, "text_offset": [25]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": " help", "index": 0, "logprobs": {"tokens": [" help"], "token_logprobs": [-0.0011030084], "top_logprobs": null, "text_offset": [27]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": " you", "index": 0, "logprobs": {"tokens": [" you"], "token_logprobs": [-0.06487423], "top_logprobs": null, "text_offset": [32]}, "finish_reason": null}], "model": "text-davinci-003"}
data: {"id": "cmpl-6s2UPwe7LCDHOJKUuT2R31GSsZBwp", "object": "text_completion", "created": 1678337301, "choices": [{"text": "?", "index": 0, "logprobs": {"tokens": ["?"], "token_logprobs": [-0.034225788], "top_logprobs": null, "text_offset": [36]}, "finish_reason": null}], "model": "text-davinci-003"}
data: [DONE]

That last event (the one before [DONE]) should have its finish_reason set to "stop" rather than null. If I don't use the stream option, the completion does get the correct finish_reason.

This bug is particularly awkward when combining the n option with stream. In that case the response interleaves multiple completions, and without a finish_reason there is no way to tell whether a given completion has finished until the entire response has been received and processed.
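To illustrate the n-option problem, here is a minimal sketch (not the actual client library) of consuming the raw SSE lines and tracking finish_reason per choice index. With the bug described above, some indexes may still be None when [DONE] arrives:

```python
import json

def collect_finish_reasons(sse_lines, n):
    """Track the finish_reason reported for each of the n choices
    in a stream=true completions response, given raw SSE lines."""
    finish_reasons = {i: None for i in range(n)}
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip comments / keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            reason = choice.get("finish_reason")
            if reason is not None:
                finish_reasons[choice["index"]] = reason
    return finish_reasons

# Simplified example with n=2; only choice 1 ever reports "stop":
lines = [
    'data: {"choices": [{"text": "Hi", "index": 0, "finish_reason": null}]}',
    'data: {"choices": [{"text": "Yo", "index": 1, "finish_reason": "stop"}]}',
    'data: [DONE]',
]
print(collect_finish_reasons(lines, 2))  # {0: None, 1: 'stop'}
```

With the bug, choice 0 ends the stream with finish_reason still None, so the consumer cannot distinguish "finished" from "cut off".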


If anyone runs into this, please ping me — I am not able to reproduce it.


I think it is documented here: https://platform.openai.com/docs/api-reference/chat/streaming#choices-finish_reason

And from the API reference:

> **stream** (boolean or null, Optional, defaults to false)
>
> If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Example Python code.

As @sergeliatko has said, this is expected behaviour from a stream=true API call.

Upon receiving the [DONE] marker, you can end your streaming session. To detect failed connections robustly, keep a timer running between tokens and set it to some reasonable value, say 15 seconds; if no token arrives within that window, treat the stream as failed.
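The timer idea can be sketched as a small wrapper around a token iterator (a hypothetical helper, not part of any SDK) that raises if the gap between tokens exceeds the timeout:

```python
import queue
import threading

def with_token_timeout(token_iter, timeout=15.0):
    """Yield tokens from token_iter, raising TimeoutError if more than
    `timeout` seconds pass without a new token arriving."""
    q = queue.Queue()
    sentinel = object()

    def pump():
        # Feed tokens into the queue from a background thread so the
        # consumer can apply a per-token timeout on the receiving side.
        for tok in token_iter:
            q.put(tok)
        q.put(sentinel)

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            tok = q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"no token received within {timeout:.0f}s")
        if tok is sentinel:
            return
        yield tok

# Usage: these tokens arrive instantly, so no timeout fires.
print(list(with_token_timeout(iter(["Hello", " world"]), timeout=1.0)))
```

The 15-second figure above is just the example value; tune it to your model's typical inter-token latency.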