Responses API - reconnect to a streaming response

Hello, I am having trouble reconnecting to a streaming response (store=true) when using the Responses API.

The request looks like this:


{
  "model": "gpt-5-2025-08-07",
  "max_output_tokens": 128000,
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "please write the longest story about pelicans"
        }
      ]
    }
  ],
  "store": true,
  "stream": true
}

This returns a stream of events, which contains a reference to the response, say resp_0XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8 (anonymized).
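For context, here is a minimal sketch of how I track the response id and the last sequence_number while consuming the stream (the helper and the SSE `data:` framing assumptions are mine, not from the docs), since a resume would presumably need that cursor:

```python
import json

def track_stream(sse_lines):
    """Scan SSE `data:` lines, recording the response id and the
    highest sequence_number seen so far (a potential resume cursor)."""
    response_id, last_seq = None, -1
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip `event:` lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        if event.get("type") == "response.created":
            response_id = event["response"]["id"]
        last_seq = max(last_seq, event.get("sequence_number", -1))
    return response_id, last_seq
```

With a captured stream, `track_stream` gives back the `resp_…` id plus the last event number that was actually delivered before the drop.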

Then, assume the connection is interrupted. Since the response is stored and streamable, I expect to be able to reconnect.

So I make a call to https://api.openai.com/v1/responses/resp_0XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8?stream=true

And I get this:

404 (Not Found): Response with id 'resp_0XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8' not found.

I would expect to get a stream of events instead of the 404, restarting from sequence number 0 (since no cursor is passed in the query).

What am I doing wrong?

Closing the connection on a normal POST call with "stream": true terminates the generation. The model only generates while there is a connection consuming the events.

If you want a call that can survive network interruptions, you will want to use the "background": true parameter on the Responses API.

This allows you to set the job in motion and receive an initial object with the job ID and first status.

Then you can poll the response object by ID, watch its status field to see when the API call has completed, and retrieve the (non-stream) output via the GET endpoint.
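That polling loop can be sketched like this (the `fetch_response` callable stands in for a GET on /v1/responses/{id}; all names here are illustrative, not from the API reference):

```python
import time

# Statuses that mean the background job is finished, for better or worse.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}

def poll_until_done(fetch_response, interval_s=2.0, max_polls=300):
    """Poll a background response until its status leaves the
    queued/in_progress states, then return the final response dict.
    `fetch_response` is any callable that performs the GET and
    returns the parsed response object."""
    for _ in range(max_polls):
        resp = fetch_response()
        if resp.get("status") in TERMINAL_STATUSES:
            return resp
        time.sleep(interval_s)
    raise TimeoutError("response did not finish within the polling budget")
```

The fixed `max_polls` bound is just defensive; a long generation like a 128k-token story may need a generous budget or exponential backoff instead.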

You will have to use that specific mode; you cannot pick up a stream where you left off once it has terminated, nor can you reconnect to a "stream": false call that is still running and billing (although it may run long enough that a retrievable response ID is eventually formed).

A response ID is primarily for reuse as previous_response_id in the next call, to continue the conversation memory.

This is a different use case: I'm using both stream=true and store=true, as in the request above. That does not terminate the generation; in fact, if I wait long enough, I see the response appear about 10 minutes after closing the connection.

I haven’t found an explanation of how store=true, stream=true and background interact, and this appears to be the main article: http://platform.openai.com/docs/guides/background?lang=curl

I guess the process would be:

  • start with background=true, store=true and stream=true
  • connection closes immediately
  • then connect GET to the response with query param ?stream=true
  • if the status is not 404 or pending, events will start flowing from the right seq_num?
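If that flow is right, the background guide describes a starting_after query parameter for resuming a stream from a given sequence_number. A sketch of building that retrieval URL (I'm assuming the parameter name from the guide; nothing here is verified against the live API):

```python
from urllib.parse import urlencode

BASE = "https://api.openai.com/v1/responses"

def resume_url(response_id, starting_after=None):
    """Build the GET URL to re-attach to a stored background stream.
    `starting_after` is the last sequence_number already consumed;
    omit it to replay events from the beginning (cursor 0)."""
    params = {"stream": "true"}
    if starting_after is not None:
        params["starting_after"] = str(starting_after)
    return f"{BASE}/{response_id}?{urlencode(params)}"
```

So a fresh reconnect would GET `…?stream=true`, while a mid-stream resume would add `&starting_after=<last seq_num>`.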