https://github.com/openai/openai-python/blob/e389823ba013a24b4c32ce38fa0bd87e6bccae94/openai/api_requestor.py#L760
...
```python
resp = OpenAIResponse(data, rheaders)
# In the future, we might add a "status" parameter to errors
# to better handle the "error while streaming" case.
stream_error = stream and "error" in resp.data
if stream_error or not 200 <= rcode < 300:
    raise self.handle_error_response(
        rbody, rcode, resp.data, rheaders, stream_error=stream_error
    )
return resp
```
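For context, here is a minimal caller-side sketch (using the legacy 0.x openai Python SDK; the model, prompt, and API key are placeholders) showing where that raised error surfaces when you consume a stream through the library:

```python
import openai

openai.api_key = "sk-..."  # placeholder

try:
    stream = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[{"role": "user", "content": "Write a very long story."}],
        n=50,  # many long completions, to reproduce the long-running case
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta
        print(delta.get("content", ""), end="", flush=True)
except openai.error.OpenAIError as e:
    # A mid-stream "error" chunk trips handle_error_response above,
    # so it reaches the caller as an exception, not a finish_reason.
    print(f"\nstream aborted: {e}")
```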
The error arrives mid-stream. It can be reproduced with a request whose output runs longer than about five minutes (for example, n=50 with long AI responses). You still get an SSE chunk back, but the error message contains a zero-length Python bytestring (I don't have a copy of the exact error without spending more money). I suppose it could be handled like a finish_reason if you detect it, abandoning the subscription and the connection; a sketch of that follows.
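If you want the stream to end quietly instead of raising, one possible approach (a sketch, not SDK behavior; safe_stream is a hypothetical helper) is a wrapper generator that treats the failure like a terminal finish_reason:

```python
import openai

def safe_stream(stream):
    """Hypothetical wrapper: swallow a mid-stream failure and end the
    generation as if a terminal finish_reason had been received."""
    try:
        for chunk in stream:
            # Defensive check: an "error" payload in the chunk itself
            # (normally the SDK raises before the chunk reaches us).
            if "error" in chunk:
                break  # abandon the subscription and connection
            yield chunk
    except openai.error.APIError:
        return  # connection dropped mid-stream; stop iterating
```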
If you consume the raw stream data, the error usually arrives as a non-standard SSE chunk without the `data:` prefix. The chunk body is a JSON string; parse it and you get an error payload like this:
"error": {
"message": "3 is greater than the maximum of 2 - 'temperature'",
"type": "invalid_request_error",
"param": null,
"code": null,
}
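To illustrate, here is a rough sketch of consuming the raw SSE stream with requests and catching that non-standard chunk (endpoint, headers, and payload are placeholders):

```python
import json
import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer sk-..."},  # placeholder key
    json={
        "model": "gpt-3.5-turbo",  # placeholder model
        "messages": [{"role": "user", "content": "hi"}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line:
        continue  # skip SSE keep-alive blank lines
    text = line.decode("utf-8")
    if text.startswith("data: "):
        payload = text[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # ... normal delta handling goes here ...
    else:
        # Non-standard chunk without the "data:" prefix: try to parse
        # it as the JSON error object shown above.
        try:
            err = json.loads(text).get("error", {})
            print("stream error:", err.get("message"))
        except (json.JSONDecodeError, AttributeError):
            pass  # unrecognized line; ignore or log it
        break
```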