Assistants API - Access to multiple assistants

Is there a way to have access to multiple assistants in the same thread? I want to be able to choose an assistant based on the context of the conversation.


You might still have more questions not answered yet - like if this does exactly what you think it might do.

1 Like

I wanted the model itself to dynamically switch between one assistant to another depending on the prompt, as opposed to updating it manually using the API.

The threads are managed separately from the runs. When you create a run you give it a thread id and an agent id. When you create the run, it’s a one-time thing, like a chat completion request … It will either append a message to the thread and then go to completed state or else go to requires action state for you to supply function results and then it’ll append a message and go to completed, see here. So, you create a fresh run, supplying assistant and thread id, each time you want an assistant response. And for user input you just add a message to the thread yourself. There’s no reason you couldn’t get a message from the user and add it to the thread, decide (e.g. through a secondary chat completion of assistant call) what assistant to use to answer it, create run with that assistant and have it add its answer (possibly requiring you to supply function call results first), get new messages from user, decide to use different assistant etc, and have them working on the same thread. You could dont even have to wait for user input, you could create two runs in a row with different assistants. I don’t believe the actual chat completion calls the Run does in the background include assistant ids, so it’ll appear to the the assistant current answering as if it supplied all previous assistant answers, which could throw it off to some degree e.g. if the assistant write in very different styles … But otherwise it’ll work just fine.


I had a similar question.

I’ve built crude “roundtables” of AI councils but the idea of having Assistants with a specific set of skills like Liam Neeson?


Need that!


“Depending on the prompt” means another AI has to first classify the best place to send a conversation and its context off to.

Like I wrote a classifier to pick a temperature for the language model to be called with.

This may or may not be what you are looking for. I have two assistants using the same thread. One is a fiction writer the other is a critic. The writer creates a first draft of chapter 1, then has the critic provide feedback, the writer then rewrites the first draft. It may not be what you are looking for, but perhaps it moves you closer to what you are after.

import time

from openai import OpenAI

# gets API Key from environment variable OPENAI_API_KEY
client = OpenAI(
    api_key="your API key",

assistantWriter = client.beta.assistants.create(
    instructions="You are an expert writer of fictional stories",
assistantCritic = client.beta.assistants.create(
    instructions="You are an expert critic of fictional stories. You provide positive and constructive feedback",

thread = client.beta.threads.create()

message = client.beta.threads.messages.create(,
    content="""Write a single chapter about a young girl that meets a centaur in the forest. 
    Describe how she feels, what she sees, hears and even smells""",

def runAssistant(assistant_id,thread_id,user_instructions):
    run = client.beta.threads.runs.create(

    while True:
        run = client.beta.threads.runs.retrieve(,

        if run.status == "completed":
            print("This run has completed!"

            print("in progress...")
# Run the Writer Assistant to create a first draft                      
runAssistant(,,"Write the first chapter")
# Run the Critic Assistant to provide feedback 
runAssistant(,,"""Provide constructive feedback to what 
the Writer assistant has written""")
# Have the Writer Assistant rewrite the first chapter based on the feedback from the Critic        
runAssistant(,,"""Using the feedback from the Critic Assistant 
rewrite the first chapter""")

# Show the final results 

messages = client.beta.threads.messages.list(

for thread_message in
    # Iterate over the 'content' attribute of the ThreadMessage, which is a list
    for content_item in thread_message.content:
        # Assuming content_item is a MessageContentText object with a 'text' attribute
        # and that 'text' has a 'value' attribute, print it
        print(content_item.text.value) or paste code here

Is there any advantage to creating the assistants in the playground vs in code?

I’d say advantages of code are:

  1. Version control
  2. Harder to make mistakes
  3. Easier to scale
  4. Easier to add tools/functions.

Aside from that though they re doing the same thing. We still create them using the Playground as it’s quicker for us and don’t need to create many!

An approach you may want to consider is the addition of a “Facilitator” assistant. The assistant:

  1. Has clear definition of roles and expertise of each other assistant in rhe ‘group chat’.
  2. After a message is added to the thread (multiple assistants can run on the same thread) the “Facilitator” is run.
  3. The “Facilitator” reads the last message added, notes the message_id, and determines which assistant should respond. A message is added to the thread that includes the message_id and respondent assistant.
  4. Code calls a run of the respondent assistant to read message_id, and respond.
  5. The loop begiins again when the Facilitator read the last message posted.

The downside of the approach is the Facilitator may become a performance bottleneck, if the thread activity is high.

this is interesting. but i believe the ideal scenario is having the 2 assistants assigned to the same thread (something like assistant_id = [assistantWriter, assistantCritic]) and the user role guides their actions via the messages.create endpoint. (not through the run instructions).

there’s a mention of this in the Assistant API notebook in the cookbook but it isn’t linked to anywhere:

There’s a few sections we didn’t cover for the sake of brevity, so here’s a few resources to explore further:

  • Annotations: parsing file citations
  • Files: Thread scoped vs Assistant scoped
  • Parallel Function Calls: calling multiple tools in a single Step
  • Multi-Assistant Thread Runs: single Thread with Messages from multiple Assistants
  • Streaming: coming soon!

Now go off and build something ama[zing]

Hey, this code is very helpful thanks. I have a question if you dont mind. Sorry bout the light weight necro

Whats with the message and the run, the message asks a big question the run asks a subset? It just seems like a weird extra step and redundant. What if I just want to ask 1 question, like how do I sort a list in python? Would I do like instructions=“Your a professional developer” then thread =“You answer code questions” and then in the run “How do I sort a list in python” ??

Or maybe better, is thread supposed to be a more general version of the questions asked in run?