I'm building automations in n8n that rely heavily on the Assistants API, especially its thread system (to keep memory and conversation context across messages).
I read in the documentation that the Assistants API will be deprecated in August 2026 and that the Responses API will replace it.
From what I understand, we will now have objects such as Prompts (which store model settings and tools) and Conversations, which seem to be the new equivalent of threads.
My questions are:
Is there already a way to create and manage Conversations in the Responses API (as a replacement for Threads)?
Will the Responses API have built-in memory like Threads in the Assistants API? I saw a mention in the documentation that the old Threads will be called "Conversations".
Is the Prompts dashboard already available to all users?
Is there an official migration guide or practical example for anyone who used Threads + Runs in the Assistants API, especially in automations with external tools (such as n8n)?
It would be great to have a clear picture of how to migrate flows that depend on threads and runs while keeping the same memory behavior.
I'm sure there are many here who can provide guidance and links to the relevant documentation. I'll give the quick version.
Conversations is its own endpoint, but is only useful in connection with Responses.
A conversation object is a chat history container. You can pre-load user messages there and run them the way Assistants did, but that is neither required nor particularly useful.
2a. The expected pattern is that you use the "input" field of a Responses API request body for the new user role message, the "instructions" field for the "system" guidance, and then "conversation" is simply an ID you provide that stores a history; the newest user message and the AI assistant's response are appended to it for the next chat turn, giving you 'memory'.
2b. There are no controls or limits on how large a 'conversation' can grow - it will run up to the model's context limit, and then start breaking the cache every turn.
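The turn pattern in 2a can be sketched as a request body. This is a minimal illustration, not an official shape: the field names follow the description above, but the model name is a placeholder and the exact body should be checked against the current Responses API reference.

```python
# Sketch of one Responses API chat turn using a stored conversation.
# Field names follow the pattern described in 2a; model name is assumed.

def build_turn(conversation_id, user_message, instructions):
    """Build a Responses API request body for one chat turn (sketch)."""
    return {
        "model": "gpt-4.1-mini",          # placeholder; use your own model
        "instructions": instructions,      # the 'system' guidance
        "input": user_message,             # the new user role message
        "conversation": conversation_id,   # stored history; this turn's input
                                           # and output get appended to it
    }

body = build_turn("conv_123", "What did I ask earlier?", "Answer briefly.")
```

Each turn only sends the newest message; the server-side conversation supplies the prior context.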
Prompts is not mandatory, or even all that useful. You are indeed required to go to the platform.openai.com site, as owner, open the playground, shape a prompt via a frustrating UI, and save it to get an ID you can reuse - with fixed tools, instructions, and some of the settings. All of that is stuff you can send in an API request yourself. Using the instructions field, Prompts, or Responses itself ultimately saves you little bandwidth, because it is all echoed back at you.
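For reference, referencing a saved Prompt in a request looks roughly like the sketch below. The "prompt" field shape, the `pmpt_` ID, and the variables key are assumptions based on the dashboard flow described above; verify against current docs before relying on them.

```python
# Hypothetical request body referencing a dashboard-saved Prompt by ID.
# The ID and the "prompt" object shape are assumptions, not confirmed API.

request_body = {
    "model": "gpt-4.1-mini",               # placeholder model name
    "prompt": {
        "id": "pmpt_abc123",               # hypothetical ID from the dashboard
        "variables": {"customer": "Ana"},  # optional template substitutions
    },
    "input": "Where is my order?",
}
```

As the answer notes, everything the saved Prompt bundles (tools, instructions, settings) can also be sent inline per request.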
Migration guide: it is just another version of "how to use" - there are no migration tools for porting your data.
Conclusion:
• Use Chat Completions.
• Manage your own conversation history.
• Implement your own functions instead of built-in tools such as file_search (downtimes lasting days, even within the last 24 hours, plus injected instructions that make the AI think the "files" came from a user), web_search (countless internal tool loops possible, only for the output to be run against OpenAI's own system message and produce their search engine product), code interpreter (pay per use, pay even when not employed if you want to self-manage, and total session data loss after 20 minutes of inactivity that cannot continue a chat), or MCP (a slow, non-compliant way of doing nothing but 'search' over the internet back to your own code anyway).
• Reject 'reasoning summaries', which are gated behind ID verification anyway.
• Reject the idea that some new models being offered only on Responses is a reason to write non-portable, server-dependent code, by simply not using them.
• Get high performance without an interloping service that is missing key parameters.
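"Manage your own conversation" can be as simple as keeping a message list yourself and sending it with each Chat Completions call. A minimal sketch, where the trim policy (keep the system message plus the last N turns) is an illustrative choice, not an OpenAI requirement:

```python
# Self-managed chat history for Chat Completions (sketch).
# The trimming strategy here is a simple illustrative policy.

class ChatMemory:
    def __init__(self, system_prompt, max_turns=10):
        self.system = {"role": "system", "content": system_prompt}
        self.turns = []            # alternating user/assistant messages
        self.max_turns = max_turns

    def add(self, role, content):
        self.turns.append({"role": role, "content": content})
        # keep only the most recent exchanges so each request stays bounded
        self.turns = self.turns[-2 * self.max_turns:]

    def messages(self):
        # full list to pass as messages=... in a Chat Completions request
        return [self.system] + self.turns

mem = ChatMemory("You are a terse assistant.", max_turns=2)
mem.add("user", "hi")
mem.add("assistant", "hello")
mem.add("user", "one")
mem.add("assistant", "1")
mem.add("user", "two")
mem.add("assistant", "2")
# the oldest turn is dropped; the system message is always kept
```

This keeps the history portable: you can swap providers or models without any server-side state to migrate, which is exactly the lock-in the conclusion warns about.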