Preserve Full Assistants-API Functionality in the Upcoming Responses API

Hello OpenAI Team,

OpenAI’s Assistants API is unique because it gives us persistent, persona-rich assistants and stateful, first-class threads—two pillars that make large-scale conversational apps feasible.

  1. Persistent Assistants with Elaborate System Instructions
    • An assistant object stores an extensive system prompt once and re-uses it on every run.
    • This “permanent brain” eliminates repetitive prompting, enforces consistent behavior, and keeps token costs predictable.
    • Losing or downgrading this feature would force developers to externalize and resend large instruction blocks, adding latency and expense.
  2. First-Class, Scalable Threads
    • Threads isolate context per user and per conversation, automatically maintaining history and tool-call state.
    • Enterprises may run thousands of assistants, each spawning tens of thousands of parallel threads.
    • If threads become optional or shallow, developers would need to rebuild complex session-management layers, erasing one of the API’s greatest advantages. (A minimal sketch of the workflow both points describe follows this list.)
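For reference, here is a minimal sketch of that workflow using the current Python SDK. The assistant name, model, instructions, and message content are illustrative placeholders, not a prescription:

```python
from openai import OpenAI

client = OpenAI()

# The elaborate system prompt is stored once on the assistant object
# and reused on every run (the "permanent brain").
assistant = client.beta.assistants.create(
    name="Support Agent",   # illustrative
    model="gpt-4o",         # illustrative
    instructions="You are a meticulous support agent for Acme Corp. ...",
)

# Each user/conversation gets its own thread; the API maintains the
# message history and tool-call state for us.
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Where is my order?",
)

# A run combines the stored instructions with this thread's accumulated context;
# we never resend the large instruction block ourselves.
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    print(messages.data[0].content[0].text.value)
```

The key property is that the long instruction block lives server-side on the assistant and the per-user state lives on the thread, so each request from our side stays small and cheap.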

Request: Please ensure that the new Responses API provides identical—or better—support for (1) persistent assistants with rich system instructions and (2) unlimited, fully stateful threads per assistant. These capabilities are foundational; without them, the platform’s value and developer confidence will drop sharply.

You will be glad to see that there are now ‘Prompts’ in the API, which even have versioning (which Assistants did not).
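That is a welcome step. As I understand the new reusable Prompts, a Responses call can reference one by id and pinned version instead of resending the text; the prompt id, version, and variables below are hypothetical placeholders rather than real values:

```python
from openai import OpenAI

client = OpenAI()

# Reference a dashboard-managed, versioned Prompt instead of resending its text.
# The prompt id, pinned version, and variable are hypothetical placeholders.
response = client.responses.create(
    model="gpt-4o",  # illustrative
    prompt={
        "id": "pmpt_example123",
        "version": "2",
        "variables": {"customer_name": "Ada"},
    },
    input="Where is my order?",
)

print(response.output_text)
```

If this is the intended replacement for assistant-level instructions, pinned versions would actually improve on what Assistants offered.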

The Responses API, including the ‘last response’ chaining, is starting to get pretty close to Threads (with function calling, Code Interpreter, and even Deep Research available now).
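For comparison, the chaining pattern I mean looks roughly like this; the model name and instructions are placeholders, and note that, unlike an assistant object, the instructions still have to travel with every request:

```python
from openai import OpenAI

client = OpenAI()

SYSTEM = "You are a meticulous support agent for Acme Corp."  # illustrative

# First turn: store the response so later turns can chain off it.
first = client.responses.create(
    model="gpt-4o",       # illustrative
    instructions=SYSTEM,  # supplied per request, not stored server-side
    input="Where is my order?",
    store=True,
)

# Second turn: pass the previous response id instead of resending the history.
second = client.responses.create(
    model="gpt-4o",
    instructions=SYSTEM,
    input="And when will it arrive?",
    previous_response_id=first.id,
)

print(second.output_text)
```

Chaining on the previous response gets close to a Thread, but the remaining gap (server-side, reusable instructions plus unbounded per-user state) is exactly what the request above is about.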