[Assistants API] Issue with gpt-4o: httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

Hi, I’m encountering an issue while using the Assistants API with the gpt-4o model via the OpenAI Python SDK.

I consistently receive the following error: httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

Log:

Traceback (most recent call last):
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
    yield
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 127, in __iter__
    for part in self._httpcore_stream:
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 407, in __iter__
    raise exc from None
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 403, in __iter__
    for part in self._stream:
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 342, in __iter__
    raise exc
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 334, in __iter__
    for chunk in self._connection._receive_response_body(**kwargs):
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 203, in _receive_response_body
    event = self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 213, in _receive_event
    with map_exceptions({h11.RemoteProtocolError: RemoteProtocolError}):
  File "/Users/.pyenv/versions/3.12.2/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/Documents/app/routes/assistant_stream.py", line 133, in stream_run_events
    for event in run:
  File "/Users/Documents/venv/lib/python3.12/site-packages/openai/_streaming.py", line 46, in __iter__
    for item in self._iterator:
  File "/Users/Documents/venv/lib/python3.12/site-packages/openai/_streaming.py", line 58, in __stream__
    for sse in iterator:
  File "/Users/Documents/venv/lib/python3.12/site-packages/openai/_streaming.py", line 50, in _iter_events
    yield from self._decoder.iter_bytes(self.response.iter_bytes())
  File "/Users/Documents/venv/lib/python3.12/site-packages/openai/_streaming.py", line 278, in iter_bytes
    for chunk in self._iter_chunks(iterator):
  File "/Users/Documents/venv/lib/python3.12/site-packages/openai/_streaming.py", line 289, in _iter_chunks
    for chunk in iterator:
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_models.py", line 897, in iter_bytes
    for raw_bytes in self.iter_raw():
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_models.py", line 951, in iter_raw
    for raw_stream_bytes in self.stream:
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_client.py", line 153, in __iter__
    for chunk in self._stream:
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 126, in __iter__
    with map_httpcore_exceptions():
  File "/Users/.pyenv/versions/3.12.2/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/Users/Documents/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

Environment:

Model: gpt-4o

OpenAI SDK Version: 2.8.1

6 Likes

I’ve been facing the same problem all day. Is there any official explanation?

I’m using the Assistants API with the gpt-4.1 model; in streaming mode it returns part of the answer and then suddenly throws the "peer closed connection without sending complete message body (incomplete chunked read)" error.

1 Like

I’ve been seeing the same thing today. It appears to be on responses from OpenAI over a certain size. anyone had any luck?

3 Likes

I am also running into the same problem with long responses when streaming:

httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

2 Likes

Weirdly, for me the issue doesn’t occur when using the "gpt-3.5-turbo" model. That model isn’t suitable for my use case, but it may be useful for others in the meantime.

1 Like

It appears that gpt-3.5-turbo produces the same error with the Assistants API. I also tested the Responses API with gpt-4o, and I don’t see the same problem there.

Update: Migrated to the Responses API, and it seems to be working.

This is reproducible on the OpenAI API platform. Our assistants are configured with the gpt-4.1 model, and the issue seems to happen on streaming calls.

I am encountering the same streaming issue reported here, both in my applications and when using the Assistants Playground on the OpenAI platform. In my experience, streamed responses tend to cut off prematurely if the request lasts longer than about 15 seconds, whereas streams complete successfully for responses under 15 seconds. This happens consistently across several different models I have tested, indicating it is not limited to a specific model. Notably, this problem has only appeared within the last 48 hours; before that, streaming functioned as expected.

Just echoing others in that we are seeing the same problem:
Assistants API streaming ❌
Assistants API non-streaming ✅ (I think)
Responses API streaming ✅

On the Playground it does not work in streaming mode; normal mode works. We cannot migrate our codebase to the Responses API easily (Vector Store and RAG don’t work the same between the two APIs, I believe), so we are trying to replace streaming with the normal response mode in the meantime, just to keep services up. The deadline for migrating to the Responses API was August 2026, but I guess it has to happen in a day …

Edit: We tried playing around with the max completion tokens param, to no avail.
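Another stopgap we sketched is a generic retry wrapper around stream consumption; you would pass `httpx.RemoteProtocolError` as `retry_on` in real use. Big caveat: with the Assistants API a retried run starts a fresh generation, so you get duplicated text unless you discard the partial output from the failed attempt. This only papers over the problem.

```python
import time


def consume_with_retries(make_stream, handle, retry_on, max_retries=3, backoff=1.0):
    """Consume an event stream, restarting it if a retryable error occurs.

    make_stream -- zero-argument callable returning a fresh event iterator
    handle      -- callable invoked with each event
    retry_on    -- exception class (or tuple) that triggers a retry; in our
                   case this would be httpx.RemoteProtocolError
    """
    for attempt in range(max_retries + 1):
        try:
            for event in make_stream():
                handle(event)
            return
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted; surface the original error
            time.sleep(backoff * (2 ** attempt))  # simple exponential backoff
```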

Same here… let’s hope they can resolve it and didn’t secretly push the sunset date forward.

Hello! This should be resolved shortly - thank you all for your patience.

Please keep us posted in the topic linked above, should you still experience issues going forward.