Error Internal Server Error: The response ended prematurely. (ResponseEnded)

Hello,

I am using Assistants API v1 together with the gpt-4.1-mini-2025-04-14 model in streaming mode, and I’m encountering a recurring issue: the JSON response generated by the assistant is cut off before completion.

Context

  • Backend: C# / .NET 8

  • Using Assistants API v1 (Assistants, Threads, Messages, Runs)

  • Files involved:

    • Teams transcript (.docx)

    • Participants list (CSV)

  • Using file-based search (file_search)

  • The assistant must output a long, fully structured JSON object

  • Streaming enabled (stream=true)

Issue

Randomly, the streamed response:

  1. stops before completion,

  2. returns incomplete JSON,

  3. ends in the middle of a string,

  4. triggers a deserialization error on the C# side.

Typical example:

"digital twi
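
To make the failure concrete, here is a minimal sketch of how a stream that stops mid-string leads to the deserialization error, and how the buffered text can be checked before it is parsed. It is written in Python for brevity rather than in the C# of the actual backend. The chunk contents and the `collect_json` helper are hypothetical and not part of any SDK.

```python
import json


def collect_json(chunks):
    """Join streamed text chunks and parse them only if they form complete JSON.

    Returns (parsed, None) on success, or (None, raw_text) when the
    buffered text is not valid JSON (for example, a truncated stream).
    """
    raw = "".join(chunks)
    try:
        return json.loads(raw), None
    except json.JSONDecodeError:
        # The stream ended early: return the raw text so it can be logged
        # or the request retried, instead of failing downstream.
        return None, raw


# Hypothetical chunks: the second stream stops in the middle of a string.
complete_chunks = ['{"topic": "digital ', 'twin"}']
truncated_chunks = ['{"topic": "digital ', 'twi']

parsed_ok, _ = collect_json(complete_chunks)
parsed_bad, leftover = collect_json(truncated_chunks)
```

Checking the whole buffer only after the stream ends keeps the parsing step separate from the transport, so a truncated response shows up as a recoverable condition rather than an exception deep in the deserializer.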

Sometimes I also get:

Internal Server Error: The response ended prematurely. (ResponseEnded)

This happens only in streaming mode.

Observations

  • max_tokens is set to null

  • JSON response enforced with response_format: { type: "json_object" }

  • In non-streaming mode, responses are always complete

  • Files are correctly uploaded and indexed

  • Transcript is about 40–50 minutes long, well within model limits

  • The issue occurs intermittently
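
Since non-streaming responses are always complete, one possible stopgap is to retry a truncated stream a limited number of times and then fall back to a non-streaming request. The sketch below is in Python for brevity and only shows the control flow. `run_streaming` and `run_blocking` are hypothetical callables standing in for the real API calls, and this is an assumption about a workable pattern, not a confirmed recommendation.

```python
import json


def get_json(run_streaming, run_blocking, max_stream_attempts=2):
    """Try streaming a few times, then fall back to a blocking call.

    run_streaming: hypothetical callable returning an iterable of text chunks.
    run_blocking:  hypothetical callable returning the full response text.
    """
    for _ in range(max_stream_attempts):
        try:
            raw = "".join(run_streaming())
            return json.loads(raw)
        except (json.JSONDecodeError, ConnectionError):
            # Truncated JSON or a dropped connection: try the stream again.
            continue
    # Every streaming attempt failed, so use the non-streaming path.
    return json.loads(run_blocking())
```

The fallback trades latency for completeness, so the attempt limit should stay small.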

My question

Is this a known issue with Assistants API v1 when streaming responses?
Is there a recommended workaround, configuration, or best practice to guarantee full JSON completion when using gpt-4.1-mini-2025-04-14?

Thank you for your help.

I’m having the exact same problem and response status. It started last night (November 25, 2025, GMT-5) and is still happening today (November 26, 2025). I still haven’t been able to figure out what’s going on.

Hello! This should be resolved shortly - thank you all for your patience.

Please keep us posted in the topic linked above, should you still experience issues going forward!