Assistants API Streaming Connection Closure Issue

Issue Summary

The Assistants API streaming endpoint is prematurely closing HTTP connections before sending the complete chunked transfer encoding termination sequence. This results in RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read) errors.

Severity

HIGH — This issue is affecting production systems and causing user-facing errors. The problem started occurring on November 26, 2025, and is happening consistently across multiple requests.

Environment Details

  • API: Assistants API (beta)
  • Endpoint: client.beta.threads.runs.create() with stream=True
  • Library: openai-python (latest version using AsyncOpenAI)
  • Python Version: 3.12
  • HTTP Client: httpx (via openai-python)
  • Assistant ID: asst_neZIwamXRFmWbdtnxq1LImWl
  • Model: gpt-4.1 (specified via override in run creation)

Issue Description

When streaming responses from the Assistants API, the connection is being closed by the server before the chunked transfer encoding is properly terminated. This happens consistently, affecting nearly 100% of requests since the morning of November 26, 2025.

Error Details

RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

This error occurs in the httpx library when the HTTP/1.1 chunked transfer encoding does not receive the proper termination sequence (0\r\n\r\n).
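The failure mode can be reproduced without the OpenAI API at all. The sketch below uses a toy server (not OpenAI's implementation) that sends one well-formed chunk and then closes the socket without the `0\r\n\r\n` terminator; the stdlib `http.client` reports this as `IncompleteRead`, which is the same protocol violation httpx surfaces as `RemoteProtocolError`.

```python
# Toy demonstration of an "incomplete chunked read": the server below sends
# one valid chunk, then closes the connection WITHOUT the final 0\r\n\r\n.
import http.client
import socket
import threading


def broken_chunked_server(srv: socket.socket) -> None:
    conn, _ = srv.accept()
    conn.recv(4096)  # consume (and ignore) the client's request
    conn.sendall(
        b"HTTP/1.1 200 OK\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"\r\n"
        b"5\r\nhello\r\n"  # one well-formed 5-byte chunk...
        # ...but no terminating 0\r\n\r\n before the close
    )
    conn.close()


srv = socket.socket()
srv.bind(("127.0.0.1", 0))  # pick any free port
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=broken_chunked_server, args=(srv,), daemon=True).start()

client = http.client.HTTPConnection("127.0.0.1", port)
client.request("GET", "/")
resp = client.getresponse()

outcome = "no error"
try:
    resp.read()  # reads "hello", then hits the premature close
except http.client.IncompleteRead:
    outcome = "IncompleteRead"
print(outcome)
```

The same truncated response, read through httpx, produces exactly the `RemoteProtocolError` quoted above.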

Reproduction Steps

  1. Create a thread: client.beta.threads.create()
  2. Add a message to the thread: client.beta.threads.messages.create()
  3. Create a streaming run with file search enabled:
file_search_tool = {"type": "file_search"}  # file-search tool spec for the Assistants API
stream = await client.beta.threads.runs.create(
    thread_id=thread_id,
    assistant_id=assistant_id,
    model="gpt-4.1",
    stream=True,
    tools=[file_search_tool],
    instructions="..."
)
  4. Iterate through events: async for event in stream:
  5. Result: Stream processes events successfully (messages, tool calls, etc.) but then raises the error before completing

Key Observations

1. Timing Pattern

  • Error occurs after successful processing of run steps
  • Error occurs after ThreadRunStepCompleted events
  • Error occurs during or after file search tool calls complete
  • The assistant’s response is generated successfully on OpenAI’s side

2. Evidence of Successful Completion

When we poll the run status after the error occurs:

run_status = await client.beta.threads.runs.retrieve(
    thread_id=thread_id,
    run_id=run_id
)
# Returns: status='completed'

The messages are fully generated and retrievable:

messages = await client.beta.threads.messages.list(
    thread_id=thread_id
)
# Successfully returns complete assistant message with citations

3. Consistency

  • Happens on ~100% of streaming requests since Nov 26, 2025
  • Occurs across different threads and assistants
  • Occurs regardless of message complexity
  • Affects both initial messages and follow-up messages in a conversation

Technical Analysis

Root Cause

The OpenAI Assistants API server appears to be closing the HTTP connection before properly terminating the chunked transfer encoding. This suggests:

  1. Server-side premature connection closure — The server closes the socket after sending the last data chunk but before sending the termination sequence
  2. Potential timeout or resource cleanup issue — The connection may be closed as part of cleanup before streaming is properly finalized
  3. Load balancer or proxy issue — An intermediary service may be closing connections prematurely

Why This Matters

  • The HTTP/1.1 chunked transfer encoding specification (RFC 7230) requires a final chunk of size 0 followed by CRLF (0\r\n\r\n) to signal the end of the message body
  • Without this termination sequence, HTTP clients (such as httpx) correctly identify this as a protocol error
  • This is not a client library bug—the server behavior violates the HTTP specification
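The framing requirement is easy to see in code. This small, hypothetical encoder shows the wire format RFC 7230 §4.1 prescribes; the trailing bytes on the last line are exactly the sequence the endpoint is failing to send.

```python
# Chunked transfer encoding per RFC 7230 §4.1: each chunk is
# "<size-in-hex>\r\n" + data + "\r\n", and the body MUST end with a
# zero-size chunk followed by a blank line: "0\r\n\r\n".
def encode_chunked(parts: list[bytes]) -> bytes:
    body = b""
    for part in parts:
        body += f"{len(part):x}\r\n".encode("ascii") + part + b"\r\n"
    # The terminator the streaming endpoint is currently omitting:
    return body + b"0\r\n\r\n"


# A single 5-byte chunk followed by a clean termination:
framed = encode_chunked([b"hello"])  # b"5\r\nhello\r\n0\r\n\r\n"
```

A client that receives everything up to but not including that final `0\r\n\r\n` has no way to distinguish a finished body from a truncated one, which is why httpx must raise.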

Expected Behavior

The streaming endpoint should:

  1. Send all data chunks for the streaming events
  2. Send the final 0\r\n\r\n sequence to properly terminate the chunked transfer
  3. Close the connection cleanly

Actual Behavior

The streaming endpoint:

  1. Sends all data chunks for the streaming events
  2. Closes the connection abruptly
  3. Never sends the termination sequence

Impact Assessment

  • User Experience: Degraded — requires fallback polling, adding 1–15 seconds of latency
  • System Reliability: Affected — requires complex error handling and recovery logic
  • Production Status: Mitigated with a workaround, but not resolved
  • Business Impact: HIGH — affects real-time user interactions in a production system
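For reference, our polling fallback follows roughly this shape. The three callables are hypothetical wrappers around the calls shown earlier (runs.create(stream=True), runs.retrieve, messages.list), kept as parameters so the recovery logic is library-agnostic; in production the except clause targets httpx.RemoteProtocolError specifically.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def stream_with_polling_fallback(
    open_stream: Callable[[], AsyncIterator[object]],
    poll_status: Callable[[], Awaitable[str]],
    fetch_messages: Callable[[], Awaitable[list]],
    poll_interval: float = 1.0,
    max_polls: int = 15,
) -> list:
    """Consume the stream; on a broken connection, fall back to polling."""
    events: list = []
    try:
        async for event in open_stream():
            events.append(event)  # normal path: deltas arrive incrementally
        return events
    except Exception:  # in production: httpx.RemoteProtocolError
        # The run almost always finishes server-side (see "Evidence of
        # Successful Completion"), so poll until it reports completed and
        # return the fully generated messages instead of failing the request.
        for _ in range(max_polls):
            if await poll_status() == "completed":
                return await fetch_messages()
            await asyncio.sleep(poll_interval)
        raise  # run never completed: surface the original stream error
```

This is where the 1–15 seconds of added latency in the impact table comes from: each poll cycle costs up to poll_interval seconds before the completed messages can be fetched.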

Request

Please investigate and fix the Assistants API streaming endpoint so that it properly terminates HTTP chunked transfer encoding before closing connections. This appears to be a server-side issue that began on November 26, 2025.

Additional Context

We are aware that the Assistants API is scheduled to be sunset later in 2026. However, this is a critical production issue affecting current users. A fix or official acknowledgment would help us determine whether to:

  1. Wait for a fix
  2. Migrate to the Chat Completions API earlier than planned
  3. Continue relying on our workaround

The same is happening here


Thank you for this detailed summary. We are experiencing the same thing.


@caseyjenks @mczi Thanks for the reply! Glad to hear we're not the only ones. Let's hope OpenAI notices and fixes this. It seems to be affecting a lot of people. @nikunj

We are encountering the same streaming issue reported here, in web applications and when using the Assistants Playground on the OpenAI platform.

Streamed responses tend to cut off prematurely if the request lasts longer than about 15 seconds, whereas streams complete successfully for responses under 15 seconds.

This happens consistently across several different models we have tested, indicating it is not limited to a specific model. Notably, this problem has only appeared within the last 48 hours; before that, streaming functioned as expected.

A fix is being deployed at this moment.
Please keep us posted in the linked topic, should you still run into issues.
