Hello, we’re currently looking into migrating from the deprecated Assistants API to the new Responses API.
Old implementation
For our use case, we used the API to create a new Assistant for each of our clients. This also made it possible to keep the instructions given to each client’s Assistant on our platform.
The problem we’re facing
With the new Responses API, Prompts are the new Assistants. The problem with Prompts is that it’s not possible to CRUD them via the API.
Our solution
What we came up with after reading the guide is to use a single Prompt with variables for each client (user-data variables only), and to pass the instructions via the Response object.
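Roughly, the per-client call would look like this (a sketch using the official Python SDK; the prompt ID, variable names, and instruction text are placeholders, and we still need to verify that “prompt” and “instructions” can be combined this way):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One shared Prompt for all clients, parameterized with user-data variables;
# the per-client instruction text travels in the request itself.
response = client.responses.create(
    model="gpt-4.1",
    prompt={
        "id": "pmpt_abc123",  # placeholder ID of the single shared Prompt
        "variables": {"client_name": "Acme", "plan": "enterprise"},
    },
    instructions="You are the support assistant for Acme. ...",  # per client
    input="Hello, what can you help me with?",
)
print(response.output_text)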
Questions:
Is there a better way to solve the problem we are facing (not being able to create Prompts dynamically)? And is our solution a good idea?
Is there a set limit on the Response object’s instructions property?
using “prompts”: completely optional (with issues, because you cannot retrieve their contents to run “include” parameters on tools correctly, or alter them via the API);
using “variables”: quite optional; a cache-breaking workaround for those prompts being unalterable via the API;
using “conversations”: completely optional (with issues: near-unlimited input costs, no way to clean up an expired code interpreter container, issues with tools vs. reasoning, and storage of unwanted response state); not meant for an initial instruction.
using “previous_response_id”: completely optional; an unmanageable state you cannot migrate out of, but from which you can branch.
even using “instructions” as an API parameter: completely optional.
You can set “store”: false and construct every turn yourself, per API call: initial system/developer-role messages as “instructions”, previous chat history messages that you’ve retained yourself and manage in length, then the newest input (which you can shape by placing further temporary messages near it), and even every sequence of function calls.
“instructions” is a per-turn, every-turn field inserted before any “input” messages you pass. It also comes before prior context supplied by “conversation” or “previous_response_id” (and when you employ either of those, instructions become near-mandatory once conversations grow longer than the model input and the first messages are discarded). It is most like the instructions field of an “Assistant”, except that you furnish the text each time instead of the assistant ID each time.
So the way to solve the problems with the offered solutions is not to use these turnkey proprietary generics.
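For illustration, a minimal sketch of that fully self-managed pattern over plain HTTP with httpx (the model choice and the keep-the-last-20-messages trimming policy are placeholder assumptions, not recommendations):

import os
import httpx

history: list[dict] = []  # the chat history you retain and trim yourself

def run_turn(user_text: str, client_instructions: str) -> str:
    history.append({"role": "user", "content": user_text})
    payload = {
        "model": "gpt-4.1-mini",               # placeholder model choice
        "instructions": client_instructions,   # per-client text, sent every turn
        "input": history[-20:],                # your own length management
        "store": False,                        # no server-side response state
    }
    response = httpx.post(
        "https://api.openai.com/v1/responses",
        json=payload,
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        timeout=120,
    )
    response.raise_for_status()
    # Collect all output_text parts from the response's output items
    reply = "".join(
        content["text"]
        for output in response.json().get("output", [])
        for content in output.get("content", [])
        if content.get("type") == "output_text"
    )
    history.append({"role": "assistant", "content": reply})
    return reply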
Thanks for the detailed breakdown.
We’re mostly looking for a practical pattern to pass per-client instructions each turn. From your experience, is there any hidden limit or best practice for how long the instructions field can be?
I’ve got 20 MB of dereferenced OpenAPI specification over here - let’s see how much of it can be ingested. Encode half a million tokens for a million-token model, perhaps?
import json
import os
from pathlib import Path

import httpx
import tiktoken

INPUT_FILENAME: str = "openai.documented.yml"
MAX_TOKENS: int = 500_000


def get_api_key_headers() -> dict[str, str]:
    # Read the API key from the environment at call time
    return {"Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY')}"}


try:
    base_dir = Path(__file__).parent
except NameError:  # running interactively, where __file__ is undefined
    base_dir = Path.cwd()

text_path = base_dir / INPUT_FILENAME
text: str = text_path.read_text(encoding="utf-8")

# Tokenize with the o200k_base encoding and truncate to the token budget
enc = tiktoken.get_encoding("o200k_base")
tokens: list[int] = enc.encode(text, disallowed_special=())
truncated_tokens: list[int] = tokens[:MAX_TOKENS]
doc: str = enc.decode(truncated_tokens)

payload = {
    "model": "gpt-4.1-nano",
    "instructions": f"Reference documentation retrieval: {doc}",
    "input": "What is the topic of the documentation?",
    "store": False,
    "max_output_tokens": 200,
}

response = httpx.post(
    "https://api.openai.com/v1/responses",
    json=payload,
    headers=get_api_key_headers(),
    timeout=300,
)
try:
    response.raise_for_status()
    # Collect all output_text parts from the response's output items
    assistant_texts = [
        content["text"]
        for output in response.json().get("output", [])
        for content in output.get("content", [])
        if content.get("type") == "output_text" and "text" in content
    ]
    print("\n---\n\nCollected response text:\n" + str(assistant_texts))
    print(response.json().get("usage", {}))
except httpx.HTTPStatusError:
    print(
        response.status_code,
        json.loads(response.content.decode())["error"]["message"],
    )
Nope: OpenAI is counting characters, not tokens.
400 Invalid 'instructions': string too long. Expected a string with maximum length 1048576, but got a string with length 3896434 instead.
So if you’ve got under a megabyte of “instructions” for the AI to follow (under 1 MB of network traffic on each turn), you should be okay. The rest will have to go into more messages.
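If you do hit that 1,048,576-character cap, the spillover can go into ordinary “input” messages instead; a sketch, reusing doc from the script above and assuming a developer-role input message will accept text of this size:

PREFIX = "Reference documentation retrieval: "
LIMIT = 1_048_576  # observed maximum length of the "instructions" string

head = doc[: LIMIT - len(PREFIX)]   # fills "instructions" up to the cap
rest = doc[LIMIT - len(PREFIX):]    # everything past the cap

payload = {
    "model": "gpt-4.1-nano",
    "instructions": PREFIX + head,
    "input": [
        # spillover placed as a developer-role message before the question
        {
            "role": "developer",
            "content": "Reference documentation, continued: " + rest,
        },
        {"role": "user", "content": "What is the topic of the documentation?"},
    ],
    "store": False,
    "max_output_tokens": 200,
}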