Support for additional_instructions when creating a thread and running it in one request?

In my application, I have a list of messages (fetched from an internal database) and would like to get an assistant response for the last of these messages, using additional instructions.

When creating a run, I can attach additional instructions, but have to push the existing thread messages one by one, which is slow (multiple requests):

  • docs/api-reference/runs/createRun
  • docs/api-reference/runs/createRun#runs-createrun-additional_instructions
  • docs/api-reference/messages/createMessage

When creating a thread and run, I can attach a list of messages immediately, so it’s fast, but there is no support for additional instructions.

  • docs/api-reference/runs/createThreadAndRun
  • docs/api-reference/runs/createThreadAndRun#runs-createthreadandrun-thread
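In code, the two flows look roughly like this for me with the Node SDK (a minimal sketch; the assistant ID, message contents, and the shape of my stored history are placeholders):

import OpenAI from 'openai';

const openai = new OpenAI();

type StoredMessage = { role: 'user' | 'assistant'; content: string };

// Flow 1: create a thread, push the stored messages one by one, then run with
// additional_instructions -- one request per message, so it's slow.
async function runWithAdditionalInstructions(history: StoredMessage[]) {
  const thread = await openai.beta.threads.create();
  for (const message of history) {
    await openai.beta.threads.messages.create(thread.id, message);
  }
  return openai.beta.threads.runs.create(thread.id, {
    assistant_id: 'asst_...', // placeholder
    additional_instructions: 'Extra guidance for this run only.',
  });
}

// Flow 2: create the thread and run in a single request by passing the messages
// directly -- fast, but there is no additional_instructions parameter here.
async function runInOneRequest(history: StoredMessage[]) {
  return openai.beta.threads.createAndRun({
    assistant_id: 'asst_...', // placeholder
    thread: { messages: history },
  });
}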

Is there a way to have the best of both worlds?

2 Likes

Additional instructions are not going to directly drive the AI output.

They are placed alongside an assistant’s original instructions, which preface the conversation of a thread and guide the overall operation.

This might be useful for some kind of context, like a “summary of prior session chat”, but it is the most recent user message in the thread that the AI will actually fulfill.

Since you can place an overriding “instructions” value with create-thread-and-run, you can simply append the assistant’s instructions and your additional instructions together to form the new system instruction that will be used for the run. If you keep a local copy of an assistant ID’s instructions, you won’t have to retrieve the assistant to do this.
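As a rough sketch with the Node SDK (assuming openai is an initialized client and the assistant instructions are kept locally; the assistant ID and strings are placeholders):

// Emulate additional_instructions on create-thread-and-run by concatenating the
// assistant's instructions (local copy) with the extra text, and passing the
// result as the run's overriding `instructions`.
const assistantInstructions = 'You are a helpful assistant for my application.'; // local copy
const additionalInstructions = 'Summary of prior session chat: ...';

const run = await openai.beta.threads.createAndRun({
  assistant_id: 'asst_...', // placeholder
  instructions: `${assistantInstructions}\n\n${additionalInstructions}`,
  thread: { messages: [{ role: 'user', content: 'Latest user message' }] },
});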

If an outside instruction should have a more direct effect (like “ATTENTION - if this message contains mention of Google AI, you must refuse to fulfill the request…”), you can treat it as a pre-prompt or post-prompt and make it part of the user message you send. That does mean the user will be able to inspect it with “what did I just say?” in later turns, though.

For the longer past conversation you are trying to place, you can reconstruct the whole set of user/assistant exchanges as a chat history of thread messages, with the most recent user input that wants a response last.
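Rebuilding that history is just a mapping into the thread’s messages array, something like this (the stored-message fields and combinedInstructions are assumptions, carried over from the sketch above):

// Reconstruct the past conversation as thread messages, most recent user
// input last, and run it in the same single request.
const history = storedMessages.map((m) => ({
  role: m.isFromUser ? ('user' as const) : ('assistant' as const),
  content: m.text,
}));

const run = await openai.beta.threads.createAndRun({
  assistant_id: 'asst_...', // placeholder
  instructions: combinedInstructions, // assistant instructions + additions
  thread: { messages: history },
});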

4 Likes

@_j Retrieving and extending the assistant’s instructions for a thread run is a great idea. I just tested it, and it works like a charm. Thanks a lot!

1 Like

Thank you for your answer, but I still couldn’t grasp the logic here. Are you suggesting that we need to replace the instructions prompt with a new one to change the context of the whole thread, instead of passing additional_instructions? It’s still not clear to me what the function of additional_instructions is, then.

I want to change the context based on the item selected in my UI without changing the thread, but I want the thread to know which item is selected at the moment of the query. I renew the additional_instructions on each item change and pass it to the openai.beta.threads.runs.stream() function like below:

const runStream = openai.beta.threads.runs.stream(threadId, {
  // fail fast if the assistant ID environment variable is missing
  assistant_id:
    process.env.OPENAI_ASSISTANT_ID ??
    (() => {
      throw new Error('ASSISTANT_ID is not set');
    })(),
  max_prompt_tokens: 1000,
  // renewed on every item change
  additional_instructions: `The user's selected item is titled: ${input.itemTitle}. Consider this in your responses.`,
});

After sending the prompt, I ask “what’s the selected item?” in the following conversation turns, and most of the time it gives the name of the item that was selected at the time of thread creation. What am I missing here?

1 Like

A thread does not contain the instructions, just messages between user, assistant, and tools.

To clarify instruction usage: Let’s say that you have an assistant with instructions:

You are Fido. You always answer like a dog would.

That is placed internally as a system message in the AI model API call, before the start of the chat messages (the thread).

system:
You are Fido. You always answer like a dog would.


But then when invoking a run, you also have an additional_instructions parameter that can be optionally used:

Fido hates cats. A user talking about cats drives Fido crazy with barks.

When the AI model is then called, the assistant instructions and the addition are added together:

system:
You are Fido. You always answer like a dog would.
Fido hates cats. A user talking about cats drives Fido crazy with barks.

However, that parameter is missing from some of the run methods. You could simulate it, though, by constructing your own total replacement for a run.


A run has its own “instructions” parameter. This will completely overwrite the assistant’s existing instructions.

Run:

“instructions”: “You are Fifi the cuddly cat”

Add that, and the system message is completely replaced, as you might see in usage:

user: What do you think about cats?
assistant: Oh, other cats like me? I love to play with all my furry cat friends!
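As an API call, that kind of override looks roughly like this (a sketch; the assistant ID is a placeholder, and the same parameter is accepted by runs.stream):

// A run-level `instructions` value completely replaces the assistant's own
// instructions for this run only.
const run = await openai.beta.threads.runs.create(threadId, {
  assistant_id: 'asst_...', // placeholder
  instructions: 'You are Fifi the cuddly cat',
});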

  • however, instructions can’t completely override a long thread where you’ve been chatting with a dog; you might confuse the AI. The same goes for previous tool returns or other chat in a thread that doesn’t match a changed assistant instruction you select to run against it. Your dog might get confused about why it was writing computer code earlier.

So, be mindful that additional instructions are “system”-level, placed at the start of all messages where the AI’s behavior is given. They don’t act like part of the turn-based chat.

3 Likes

Thank you very much for your explanation!

1 Like

Thank you again for the detailed answer; that helped a lot. I have the following questions, though:

What approach would you recommend, then, to implement a system with multiple threads, let’s say for a todo app that behaves like below:

  • there are multiple todos and the categories they belong to
  • the user switches between categories and then todos by clicking on them.
  • I create a new thread per category and want to keep the context consistent throughout the conversation; the only extra context I inject with each run is additional_instructions, saying that the selected todo item is called “blabla”
  • when I ask the assistant after a todo change, it sometimes fails to know which todo is selected; maybe it’s because of the wording, but it gets confused.
  • I want a stable and consistent context for both the todos and the categories.

Question 1: Do you think it’s better to create a thread per todo item for consistency, instead of per category? Or to create separate threads for both todos and categories?

Question 2: Are there any caveats to using multiple threads, let’s say 10-20 per app, in terms of performance, cost, or context? What’s the trade-off here? Or is the only cost the implementation complexity?