Is there a feature matrix for the Assistants and Responses APIs?

Since the Assistants API will be retired at some point, it would be nice to have a feature matrix showing what is already implemented in the new API.

Something like this

The research thing did the summary. I think it's pretty accurate?

| Feature | Assistants API | Responses API | Notes |
| --- | --- | --- | --- |
| Thread & Conversation History | Yes – Built-in threads automatically store and manage conversation history (Assistants API v2 FAQ / OpenAI Help Center); developers do not need to resend past messages. | Partial – No dedicated "thread" objects yet (state is not auto-stored). Supports chaining via `previous_response_id` to carry context (New tools for building agents / OpenAI) (Azure OpenAI Responses API - Azure OpenAI / Microsoft Learn). | Assistants API threads persist chat history and truncate when needed (Assistants API v2 FAQ / OpenAI Help Center). The Responses API currently requires the client to link to the last response ID for context carry-over, mimicking a threaded conversation. Full thread-like persistence is planned but not yet native (New tools for building agents / OpenAI). (Sketch below.) |
| Tool Calling & Function Execution | Yes – Supports calling tools (incl. OpenAI-hosted Code Interpreter, File Search) and improved function calling for custom tools (Assistants API v2 FAQ / OpenAI Help Center). | Yes – Fully supports tool use and function calling during a response, including built-in tools like web search, file search, and computer use (New tools for building agents / OpenAI). | Both APIs allow the model to invoke tools or developer-defined functions mid-response. The Responses API was designed to combine the simplicity of Chat Completions with the Assistants API's tool-use capabilities, and it can intermix multiple tool calls and model turns in one request (New tools for building agents / OpenAI). (See the "Code Interpreter" row for that specific tool; sketch below.) |
| File Attachments & Retrieval | Yes – Files can be uploaded and stored (in a vector store) for the assistant to retrieve context via the `file_search` tool (Assistants API v2 FAQ / OpenAI Help Center). | Yes – Fully supports retrieval from uploaded files via the File Search tool (New tools for building agents / OpenAI). (The developer must specify which vector store to use.) | The Assistants API lets you attach files to an assistant's knowledge base (with one vector store per assistant/thread) for retrieval. In the Responses API, file search is available as a built-in tool that queries a knowledge base of previously uploaded files (File search - OpenAI API). Developers set up a vector store and pass its ID in the request (e.g. `tools: [{type: "file_search", vector_store_ids: ["<id>"]}]`) to retrieve relevant content. (Sketch below.) |
| Streaming Responses | Yes – Supported (the assistant's reply could be streamed). Streaming was available in beta, though it required specific setup. | Yes – Fully supported, with improved event streaming. The Responses API streams structured events (tool invocations, partial messages) in sequence (New tools for building agents / OpenAI). | Both APIs support streaming token-by-token output. However, the Responses API was built with streaming in mind: it provides semantic events (e.g. interim tool results and final answer chunks) for easier real-time handling (New tools for building agents / OpenAI). This design makes streaming more intuitive in Responses (OpenAI even recommends the Responses API for streaming use cases). (Sketch below.) |
| System Messages & Instructions | Yes – Assistants have persistent system instructions (defining behavior/rules) set when creating the assistant, up to ~256k characters (Assistants API v2 FAQ / OpenAI Help Center). These act like a constant system prompt. | Yes – Fully supports system-level instructions per request (via a system message in the input), but there is no persistent assistant profile yet. | The Assistants API allows defining a fixed system persona or instructions for each assistant that automatically apply to all its threads (Assistants API v2 FAQ / OpenAI Help Center). In the Responses API you can include system messages or directives in the request (similar to the Chat Completions API) to guide the model's behavior. However, since there is no saved Assistant object, these instructions are not stored by the API between calls; they must be provided (or linked via context) in each session. (Sketch below.) |
| Message Formatting (Markdown) | Yes – Model responses can include Markdown formatting (lists, code blocks, etc.) as directed. The content is returned as text that may contain Markdown syntax. | Yes – Fully supported. Models respond in text (Markdown or other formats) just like in Chat Completions. | There is no difference in Markdown support: both APIs rely on the model (e.g. GPT-4) to format its reply. If instructed to use Markdown (for tables, code, etc.), the reply will contain Markdown. The Responses API is essentially a superset of the Chat Completions API (New tools for building agents / OpenAI), so it inherits the same formatting capabilities. |
| Role Usage & Continuity | Yes – Implicit role structure. The assistant's system instructions define the system role; each new user query is handled as a user message in a thread, and the assistant reply is stored as an assistant message. Continuity is automatic via thread context. | Partial – Uses the same role schema as Chat (system/user/assistant/function) for messages (New tools for building agents / OpenAI), but conversation continuity requires the `previous_response_id` mechanism (no automatic thread memory) (Azure OpenAI Responses API - Azure OpenAI / Microsoft Learn). | In the Assistants API, developers did not need to manually construct role-labeled message lists for each turn; the system and assistant roles were managed under the hood (with the assistant's persona persisting and history auto-included). The Responses API, by contrast, expects input in role-based format (a list of messages with roles, or a single user prompt), and to continue a conversation you pass the last response's ID. This gives similar continuity, but it is not as transparent as the Assistants API's built-in thread memory. (Sketch below.) |
| Code Interpreter Integration | Yes – Fully supported. Code Interpreter was a built-in tool that let the assistant execute Python code and return results (including file outputs) in a sandboxed environment (Assistants API v2 FAQ / OpenAI Help Center). | No – Missing as of launch. The Responses API does not yet include the Code Interpreter tool (Introducing the Responses API - Announcements - OpenAI Developer Community). (Planned to be added for parity.) | The Assistants API could leverage Code Interpreter (e.g. for data analysis, running code, or generating charts); this tool let the model run code and produce downloadable files or images. In the new Responses API, Code Interpreter is not currently available, but OpenAI has stated that support will be added to reach feature parity (Introducing the Responses API - Announcements - OpenAI Developer Community). In the meantime, the `computer_use` tool in Responses offers some limited code/OS interaction capabilities, but not the full programming environment that Code Interpreter provides. |
| File Uploads & Persistent Storage | Yes – Supports uploading files (up to 512 MB each) and persisting them in a vector store associated with an assistant/thread, with a persistent storage limit of 100 GB per project (Assistants API v2 FAQ / OpenAI Help Center). The assistant can repeatedly use these files via File Search across sessions. | Yes – Fully supported. Files can be uploaded to OpenAI and indexed for retrieval with the Responses API's tools. Data stored on OpenAI for this purpose is retained for the developer and not used for training by default (New tools for building agents / OpenAI). | Both APIs allow persistent knowledge bases. In the Assistants API you create a vector store for an assistant, upload files to it, and the assistant can use that data at any time (Assistants API v2 FAQ / OpenAI Help Center). In the Responses API the concept is similar: you upload files (creating a vector store index) and specify that store when using the File Search tool. The main difference is scope: in the Assistants API the vector store was tied to an assistant or thread, whereas in the Responses API you explicitly reference the relevant store in each call. OpenAI retains these files/embeddings on its servers for your project, enabling persistent retrieval across calls, and does not train on your data by default (New tools for building agents / OpenAI). (Sketch below.) |
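
To make the continuity row concrete, here is a minimal sketch of chaining two Responses API calls with `previous_response_id`, using the official `openai` Python SDK (the model name is just a placeholder, and exact parameter names may differ by SDK version):

```python
from openai import OpenAI

client = OpenAI()

# First turn: no prior context.
first = client.responses.create(
    model="gpt-4o",  # placeholder model
    input="Summarize the difference between threads and response chaining.",
)

# Second turn: link to the previous response instead of resending the history.
follow_up = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="Now give me that as a one-line takeaway.",
)
print(follow_up.output_text)
```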
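
For the tool-calling row, a sketch that mixes a built-in tool with a custom function in one Responses call. The flattened function-tool shape and the `web_search_preview` tool name reflect my understanding of the API at launch and should be checked against the current reference; the function itself is hypothetical:

```python
from openai import OpenAI

client = OpenAI()

tools = [
    {"type": "web_search_preview"},  # built-in tool (name as of launch; may change)
    {
        # Custom function tool; in Responses the fields sit at the top level (assumption).
        "type": "function",
        "name": "get_ticket_status",  # hypothetical function
        "description": "Look up a support ticket by ID.",
        "parameters": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
]

response = client.responses.create(
    model="gpt-4o",  # placeholder model
    input="Is ticket T-123 resolved? If not, search for known workarounds.",
    tools=tools,
)

# Function calls arrive as items in response.output rather than a single message.
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)
```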
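
For the file search row, the inline snippet from the table expanded into a full call (the vector store ID is a placeholder for one you have already created):

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",  # placeholder model
    input="What does the handbook say about expense limits?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_abc123"],  # placeholder ID of an existing vector store
    }],
)
print(response.output_text)
```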
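
For the streaming row, a sketch of consuming the semantic event stream; the event type strings are my best understanding and worth verifying against the API reference:

```python
from openai import OpenAI

client = OpenAI()

stream = client.responses.create(
    model="gpt-4o",  # placeholder model
    input="Stream a two-sentence summary of vector stores.",
    stream=True,
)

for event in stream:
    # Text deltas arrive as typed events rather than raw chunks (event names assumed).
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
    elif event.type == "response.completed":
        print()  # final newline once the response is done
```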
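
For the instructions row, per-request instructions in the Responses API; since nothing is persisted between calls, the string has to be resent every time:

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",  # placeholder model
    instructions="You are a terse assistant. Answer in one sentence.",  # resent on every call
    input="What is a vector store?",
)
print(response.output_text)
```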
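
For the roles row, a sketch of the role-based input list (the same schema Chat Completions uses); whether `system` or `developer` is the preferred role may depend on the model:

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",  # placeholder model
    input=[
        {"role": "system", "content": "You answer as a support engineer."},
        {"role": "user", "content": "The export job keeps timing out. What should I check first?"},
    ],
)
print(response.output_text)
```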
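
For the storage row, a sketch that uploads a file, indexes it in a vector store, and then queries it via file search. Depending on your SDK version, the vector store helpers may live under `client.beta.vector_stores` instead of `client.vector_stores`:

```python
from openai import OpenAI

client = OpenAI()

# Upload the raw file ("assistants" is the purpose file search expects, as I understand it).
with open("handbook.pdf", "rb") as fh:
    uploaded = client.files.create(file=fh, purpose="assistants")

# Create a vector store and attach the file so it gets embedded and indexed.
store = client.vector_stores.create(name="company-docs")
client.vector_stores.files.create(vector_store_id=store.id, file_id=uploaded.id)

# Later, in any call or session, reference the store explicitly.
response = client.responses.create(
    model="gpt-4o",  # placeholder model
    input="Summarize the PTO policy from the handbook.",
    tools=[{"type": "file_search", "vector_store_ids": [store.id]}],
)
print(response.output_text)
```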

Sources: The comparison above is based on official OpenAI documentation, including the Assistants API FAQ and beta documentation and the OpenAI announcement/blog for the new Responses API (Assistants API v2 FAQ / OpenAI Help Center) (New tools for building agents / OpenAI) (Introducing the Responses API - Announcements - OpenAI Developer Community). Each feature is described in terms of how it was implemented in the Assistants API and whether the same functionality exists in the Responses API (fully, partially, or not yet), along with nuanced differences in design or usage.
