Issue Summary
The Assistants API streaming endpoint is prematurely closing HTTP connections before sending the complete chunked transfer encoding termination sequence. This results in `RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)` errors.
Severity
HIGH — This issue is affecting production systems and causing user-facing errors. The problem started occurring on November 26, 2025, and is happening consistently across multiple requests.
Environment Details
- API: Assistants API (beta)
- Endpoint: `client.beta.threads.runs.create()` with `stream=True`
- Library: `openai-python` (latest version, using `AsyncOpenAI`)
- Python Version: 3.12
- HTTP Client: httpx (via openai-python)
- Assistant ID: `asst_neZIwamXRFmWbdtnxq1LImWl`
- Model: GPT-4 (specified via override in run creation)
Issue Description
When streaming responses from the Assistants API, the connection is closed by the server before the chunked transfer encoding is properly terminated. The failure is consistent, affecting nearly 100% of requests since the morning of November 26, 2025.
Error Details
```
RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
```
This error occurs in the httpx library when an HTTP/1.1 chunked transfer encoding does not receive the proper termination sequence (`0\r\n\r\n`).
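The failure mode is easy to demonstrate locally. The sketch below uses only the standard library (`http.client` rather than httpx, to avoid extra dependencies): a one-shot server sends a valid chunk and then closes the socket without the `0\r\n\r\n` terminator, and the client raises `IncompleteRead`, the stdlib counterpart of httpx's `RemoteProtocolError`:

```python
import http.client
import socket
import threading

def start_faulty_server():
    """One-shot local server that mimics the reported behavior: it sends
    a well-formed chunk, then closes the socket WITHOUT the terminating
    0\\r\\n\\r\\n sequence. Returns the port it listens on."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    def serve():
        conn, _ = server.accept()
        conn.recv(4096)  # consume the request; its contents don't matter here
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Transfer-Encoding: chunked\r\n\r\n"
            b"5\r\nhello\r\n"  # one valid chunk, but no final 0\r\n\r\n
        )
        conn.close()
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return server.getsockname()[1]

def read_chunked_body(port):
    """Read the response body; returns (partial_body, error) so the
    failure mode is easy to inspect."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/")
    resp = conn.getresponse()
    try:
        return resp.read(), None
    except http.client.IncompleteRead as exc:
        return exc.partial, exc

body, err = read_chunked_body(start_faulty_server())
print(body, type(err).__name__)  # b'hello' IncompleteRead
```

The client receives the chunk data intact (`b'hello'`) yet still errors, which mirrors what we observe against the Assistants API: the payload arrives, but the protocol-level close is invalid.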
Reproduction Steps
1. Create a thread: `client.beta.threads.create()`
2. Add a message to the thread: `client.beta.threads.messages.create()`
3. Create a streaming run with file search enabled:

   ```python
   stream = await client.beta.threads.runs.create(
       thread_id=thread_id,
       assistant_id=assistant_id,
       model="gpt-4.1",
       stream=True,
       tools=[file_search_tool],
       instructions="..."
   )
   ```

4. Iterate through events: `async for event in stream:`
5. Result: the stream processes events successfully (messages, tool calls, etc.) but then raises the error before completing
Key Observations
1. Timing Pattern
- Error occurs after successful processing of run steps
- Error occurs after `ThreadRunStepCompleted` events
- Error occurs during or after file search tool calls complete
- The assistant’s response is generated successfully on OpenAI’s side
2. Evidence of Successful Completion
When we poll the run status after the error occurs:
```python
run_status = await client.beta.threads.runs.retrieve(
    thread_id=thread_id,
    run_id=run_id
)
# Returns: status='completed'
```
The messages are fully generated and retrievable:
```python
messages = await client.beta.threads.messages.list(
    thread_id=thread_id
)
# Successfully returns complete assistant message with citations
```
3. Consistency
- Happens on ~100% of streaming requests since Nov 26, 2025
- Occurs across different threads and assistants
- Occurs regardless of message complexity
- Affects both initial messages and follow-up messages in a conversation
Technical Analysis
Root Cause
The OpenAI Assistants API server appears to be closing the HTTP connection before properly terminating the chunked transfer encoding. This suggests:
- Server-side premature connection closure — The server closes the socket after sending the last data chunk but before sending the termination sequence
- Potential timeout or resource cleanup issue — The connection may be closed as part of cleanup before streaming is properly finalized
- Load balancer or proxy issue — An intermediary service may be closing connections prematurely
Why This Matters
- The HTTP/1.1 chunked transfer encoding specification (RFC 7230) requires a final chunk of size `0` followed by CRLF (`0\r\n\r\n`) to signal the end of the message body
- Without this termination sequence, HTTP clients (such as httpx) correctly identify this as a protocol error
- This is not a client library bug—the server behavior violates the HTTP specification
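For reference, the difference on the wire is small but decisive. The bytes below illustrate a correctly terminated chunked body (per RFC 7230 §4.1) next to the truncated variant this report describes:

```python
# A minimal, correctly terminated chunked message body (RFC 7230 §4.1):
correct = (
    b"5\r\nhello\r\n"  # chunk: size in hex, CRLF, chunk data, CRLF
    b"0\r\n\r\n"       # last-chunk (size 0) plus empty trailer section
)

# The behavior described in this report is equivalent to the same body
# with the final 0\r\n\r\n missing:
truncated = b"5\r\nhello\r\n"

assert correct.endswith(b"0\r\n\r\n")
assert not truncated.endswith(b"0\r\n\r\n")
```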
Expected Behavior
The streaming endpoint should:
- Send all data chunks for the streaming events
- Send the final `0\r\n\r\n` sequence to properly terminate the chunked transfer
- Close the connection cleanly
Actual Behavior
The streaming endpoint:
- Sends all data chunks for the streaming events
- Closes the connection abruptly
- Never sends the termination sequence
Impact Assessment
- User Experience: Degraded — requires fallback polling, adding 1–15 seconds of latency
- System Reliability: Affected — requires complex error handling and recovery logic
- Production Status: Mitigated with a workaround, but not resolved
- Business Impact: HIGH — affects real-time user interactions in a production system
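Our workaround follows the pattern sketched below. This is a simplified sketch, not our production code: `events` and `poll_result` are hypothetical stand-ins for the openai stream iterator and for the `runs.retrieve()` / `messages.list()` fallback, and in production the `except` clause narrows to `httpx.RemoteProtocolError`:

```python
import asyncio

async def stream_with_fallback(events, poll_result):
    """Consume streaming events; if the stream dies mid-read, fall back
    to polling for the completed result. `events` is any async iterator
    of stream events; `poll_result` is an async callable that fetches
    the finished run (both are stand-ins for the openai calls above)."""
    received = []
    try:
        async for event in events:
            received.append(event)
    except Exception:
        # In production we catch httpx.RemoteProtocolError only, then
        # fall back to runs.retrieve() + messages.list() polling.
        return received, await poll_result()
    return received, None

# Demo with fakes that reproduce the failure mode:
async def fake_stream():
    yield "thread.run.step.completed"
    yield "thread.message.delta"
    raise RuntimeError("peer closed connection without sending complete message body")

async def fake_poll():
    return "completed"

received, polled = asyncio.run(stream_with_fallback(fake_stream(), fake_poll))
print(received, polled)
```

The polling fallback is what introduces the 1–15 seconds of extra latency noted above: events received before the disconnect are kept, but the final message must be re-fetched.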
Request
Please investigate and fix the Assistants API streaming endpoint so that it properly terminates HTTP chunked transfer encoding before closing connections. This appears to be a server-side issue that began on November 26, 2025.
Additional Context
We are aware that the Assistants API is scheduled to be sunset later in 2026. However, this is a critical production issue affecting current users. A fix or official acknowledgment would help us determine whether to:
- Wait for a fix
- Migrate to the Chat Completions API earlier than planned
- Continue relying on our workaround