Assistants API → Responses API: this is not a 1:1 migration

One thing that’s tripping people up is assuming the Responses API is just “Assistants without threads.”
It’s a bigger shift than that.

The Assistants API managed:

  • Threads

  • Runs

  • Tool attachment lifecycle

  • Polling / orchestration

The Responses API flips the model:

  • Stateless calls

  • No server-side thread lifecycle

  • No tool_resources injection at request time

  • App owns memory, retries, orchestration, and state

In practice, this means:

  • Your application becomes the conversation manager (see the sketch after this list)

  • Tools like file_search resolve context implicitly

  • Realtime / streaming becomes a deployment concern, not a model concern
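For concreteness, here's a minimal sketch of that app-managed pattern, assuming the Python SDK and a placeholder model name; the app keeps the message list itself and opts out of server-side storage:

```python
from openai import OpenAI

client = OpenAI()
history = []  # the app, not the server, is the source of truth


def ask(user_text: str) -> str:
    # Append the user turn, send the full history, append the reply.
    history.append({"role": "user", "content": user_text})
    resp = client.responses.create(
        model="gpt-4.1",  # placeholder model name
        input=history,
        store=False,      # opt out of server-side response storage
    )
    history.append({"role": "assistant", "content": resp.output_text})
    return resp.output_text
```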

If you’re migrating, think less “API replacement” and more “architecture change.”

Curious how others are handling multi-turn state and file context post-migration.

Doesn’t the Responses API have the concept of conversations stored by OpenAI, not your app?


Yes. While an essay-length diatribe could be written about the Responses API’s failings, the AI-generated statements in the first post, from an account that only posts AI-generated messages, are not true.

  1. Two stateful ways of keeping OpenAI-hosted chat history (sketch below):
  • Store responses (“store”: true), then pass previous_response_id to continue a chain of conversation.
  • Use the Conversations API: create an ID for your chat and pass it in any Responses call as the chat history. The Conversation object is updated with the latest input and the AI response. Unlike Assistants, you do not then need to retrieve the answer; it is served directly unless you use “background” mode.
  2. The thread parallel is addressed above; the only missing “lifecycle” control is that any stored conversation state plus new input will be run up to the maximum input of whatever model you call, with no budget limit.
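Here is a minimal sketch of both methods, assuming the Python SDK and a placeholder model name:

```python
from openai import OpenAI

client = OpenAI()

# Method 1: response chaining. Store each response server-side and
# pass its ID forward to continue the conversation.
first = client.responses.create(
    model="gpt-4.1", input="Hi, I'm planning a migration.", store=True
)
second = client.responses.create(
    model="gpt-4.1",
    input="Summarize what I just said.",
    previous_response_id=first.id,
)

# Method 2: Conversations API. One ID accumulates the whole history;
# each Responses call appends its input and output to the Conversation.
conv = client.conversations.create()
reply = client.responses.create(
    model="gpt-4.1",
    input="Hi again.",
    conversation=conv.id,
)
print(reply.output_text)  # served directly; no separate retrieval step
```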

  3. Tools: mostly true (see the sketch after this list).

  • file_search: you must create and pass vector store IDs that you manage yourself, per conversation, per user, or per application. Not following someone else’s pattern is the developer ideal, but unfortunately the injected tool spec gives the AI only one framing, informing it that “the user uploaded files”. New fees per use.
  • code_interpreter: you get a container ID for a session, created either automatically or by you on the containers API endpoint. Containers are ridiculously short-lived, with a new fee for every re-creation, and container contents are guarded and gated unless the AI correctly produces an in-response citation that lets you retrieve a file. Another symphony of suck, no better than ChatGPT.
  4. Your app owns x?
  • memory: the server-side conversation state methods above.
  • retries, orchestration? The Responses API server runs an internal iterator for its hosted tools, and for cognitive failures it is the AI that retries tools. Retries for API connection issues are already part of the OpenAI SDK.
  • state? Another way of saying memory.
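A sketch of the tool wiring described above, assuming the Python SDK; the vector store ID is a placeholder for one you created and manage yourself:

```python
from openai import OpenAI

client = OpenAI(max_retries=3)  # connection-issue retries are built into the SDK

resp = client.responses.create(
    model="gpt-4.1",  # placeholder model name
    input="What do the attached reports say about Q3?",
    tools=[
        # file_search: pass vector store IDs you manage per user/app/conversation
        {"type": "file_search", "vector_store_ids": ["vs_123"]},  # placeholder ID
        # code_interpreter: "auto" spins up a (short-lived) container for you;
        # you can instead pass a container ID created on the containers endpoint
        {"type": "code_interpreter", "container": {"type": "auto"}},
    ],
)
print(resp.output_text)
```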

Of course you must re-tool if you’ve been using easy-but-actually-harder Assistants and want to move to easy-but-ridiculous Responses, where streaming is a real concern: serving connections requires generator-gathering code that collects several dozen types of event (not whatever “deployment concern” the top post was describing).
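To make that concrete, a minimal streaming sketch, assuming the Python SDK; it collects only the two event types most apps need out of the several dozen the stream can emit:

```python
from openai import OpenAI

client = OpenAI()

stream = client.responses.create(
    model="gpt-4.1",  # placeholder model name
    input="Write one sentence about migration.",
    stream=True,      # returns a generator of typed events
)
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)  # incremental text
    elif event.type == "response.completed":
        print()  # terminal event; the full response object is now complete
```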

You’ve still got plenty of application state to manage: customers, their conversation sessions and metadata, their billing, subscriptions, and over-use, their moderation strikes, and the titles, shape, and expiry of their chats and resources, with database IDs and corresponding objects that would be needed on the Responses API just like on Assistants. All the API “chat hosting” is redundant to what must be synchronized with those API objects anyway.

“Doing it yourself”, rather than locking all customer resources behind an API bill and a scoped project you must keep working, is the right answer.

Yep — totally fair point.

There are server-side ways to persist continuity in Responses (via previous_response_id or Conversations), and that’s useful context.

What I was trying to highlight is less about whether state can exist, and more about who owns the lifecycle.

Compared to Assistants, Responses still shifts responsibility for:

  • memory strategy

  • orchestration across turns

  • tool usage boundaries

  • failure / retry semantics

So even with OpenAI-hosted conversation state, the architectural burden moves more clearly into the application layer.

I probably should’ve phrased it as “not Assistants-style managed” rather than “stateless” in the absolute sense.