Assistants API - Access to multiple assistants

Sounds like a very nice experiment. I am looking forward to your results.

Here is how I approached this. The messages are segregated by assistant_id. To make this clearer in the results, I changed the code shown earlier as follows. I am keeping this example simple, but as you can see, the thread simply contains the conversation between the assistants and any humans in the loop. The thread mirrors the conversation window you have when working with ChatGPT.

# Show the final results.
# messages.list returns the newest message first, so collect the labeled
# lines and print them in reverse to read the conversation top to bottom.

messages = client.beta.threads.messages.list(thread_id=thread.id)

messageStore = []

for message in messages:
    if message.assistant_id == assistantCritic.id:
        assistantName = "Critic: "
    elif message.assistant_id == assistantWriter.id:
        assistantName = "Writer: "
    else:
        assistantName = "User: "  # human-in-the-loop messages carry no assistant_id

    messageStore.append(assistantName + message.content[0].text.value)

for message in reversed(messageStore):
    print(message)

3 Likes

I have a similar use case: I have two assistants, each doing a separate task. I classify the intent of each user query based on the query and the chat history, and select an assistant based on that intent.
Both assistants have different custom functions, and the function calls also get added to the message history.
My question is: what is the best way to segregate the chat history? Should it be a single thread with both assistants’ messages and all user queries in it, or should I keep them separate?

2 Likes

Good question.

I have not experimented with this, but if there is a requirement to segregate conversations and thereby context, then separate thread(s) would seem to be the way to go.

The time you want a common thread among assistants is when you want those experts to collaborate on a task. What I like about the Assistants API framework, though, is that an assistant is not bound to only one thread in its lifetime. It’s the run object that binds an assistant to a single thread at any given time.

I can imagine the scenario where you have a panel of ā€œexpertsā€ (assistants). Let’s say these assistants represent designers, product development, marketing etc.

In a product development life cycle, the designers might collaborate on one thread, separate from product development and marketing which would have their own thread. Later in that same product development lifecycle, designers, product development and marketing could take the outcome from their individual conversations and collaborate on a new thread.
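A minimal sketch of that flow, assuming `client` is an OpenAI Python client and that the assistant and thread objects already exist (the variable names are illustrative):

# The same assistant can serve many threads, and the same thread can host
# many assistants; each run pairs exactly one assistant with one thread.
run = client.beta.threads.runs.create(
    thread_id=design_thread.id,
    assistant_id=designer.id,
)

# Later, the same designer joins a shared thread with the other experts.
run = client.beta.threads.runs.create(
    thread_id=joint_thread.id,
    assistant_id=designer.id,
)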

This suggests a better metaphor for this use case: think of threads as meeting rooms, and run objects as meetings with a specific agenda.

4 Likes

So I’ve been playing around with panels of Assistants, and the shared thread approach works. From my experiments it works best if you do the following in the instructions for each Assistant:

  1. Tell the Assistant its name. They don’t have access to the ā€œnameā€ parameter for assistants.create().
  2. Tell the Assistant to always respond with "$name: " so we know who is doing the talking.
  3. Optional, but it can be helpful to tell the Assistants the names of the other Assistants, as well as some information about their purpose or specialty.
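For instance, a rough sketch of creating one panelist along these lines (the instruction wording and model name are just assumptions):

writer = client.beta.assistants.create(
    name="Writer",  # stored on the assistant object, but not visible to the model
    instructions=(
        "Your name is Writer. Always begin your reply with 'Writer: '. "
        "You are on a panel with Critic, who reviews and critiques your drafts."
    ),
    model="gpt-4-1106-preview",  # assumption: any Assistants-capable model works
)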

If you want to select the order in which the Assistants respond, this is all you need to do. You can do a simple round robin, or you can use some other logic.
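A round robin is just indexing into the panel by turn number; a sketch, assuming `panel` is a list of assistant objects sharing one thread and `wait_for_run` is a polling helper you would write yourself:

NUM_TURNS = 6

for turn in range(NUM_TURNS):
    speaker = panel[turn % len(panel)]  # simple round robin
    run = client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=speaker.id,
    )
    wait_for_run(thread, run)  # assumed helper: poll until the run completes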

You might instead want the Assistants to determine who speaks next. You could do this in a free-flowing conversation by asking each Assistant who should respond next, or you could have a moderator Assistant who chooses who speaks next. The trick here is parsing the reply to determine who should respond next. I had trouble getting this to work reliably, so I instead created a bogus function with one parameter, Name, and in the instructions told the Assistants they should use this function to determine who goes next. When the Run indicates a function call is waiting, my code notes the Name and Prompt, returns an empty function completion to the Run, and then has the named agent respond next.
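A rough sketch of that trick, with a hypothetical nominate_next_speaker function standing in for the bogus function:

import json

# Hypothetical tool each panelist is instructed to call to pick the next speaker.
NEXT_SPEAKER_TOOL = {
    "type": "function",
    "function": {
        "name": "nominate_next_speaker",
        "description": "Nominate which assistant should respond next.",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
}

run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
if run.status == "requires_action":
    call = run.required_action.submit_tool_outputs.tool_calls[0]
    next_name = json.loads(call.function.arguments)["name"]
    # Return an empty completion so the run can finish, then have the
    # assistant named next_name take the next turn.
    client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread.id,
        run_id=run.id,
        tool_outputs=[{"tool_call_id": call.id, "output": ""}],
    )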

There are lots of variations on this approach. The key thing is to make sure the Assistants have a thread where it is clear who said what.

An alternative approach is to have a separate thread for each Assistant, and use user messages to tell each Assistant what happened. For example, imagine Alice and Bob, who are both riddle-master Assistants. The chat begins with the user prompting Alice to give Bob a riddle. Alice responds on its own thread, and the code takes the response and sends it as a user message on Bob’s thread, and so forth. So after a few rounds, you’ll have something like this (a sketch of the relay follows the two threads):

Alice’s thread

  1. user: Give Bob a riddle
  2. assistant: Bob, here’s a riddle. I speak without a mouth and hear without ears. I have no body, but I come alive with the wind. What am I?
  3. user: Bob said ā€œIt’s an echo.ā€
  4. assistant: Bob is correct. Now I want him to give me a riddle.

Bob’s thread

  1. user: Alice said: ā€œBob, here’s a riddle. I speak without a mouth and hear without ears. I have no body, but I come alive with the wind. What am I?ā€
  2. assistant: An echo.
  3. user: Alice said: ā€œBob is correct. Now I want him to give me a riddle.ā€
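Mechanically, the relay is just reposting one assistant’s reply as a user message on the other’s thread; a minimal sketch (the helper name and quoting format are assumptions):

def relay(speaker_name, reply_text, to_thread):
    # Repost one assistant's reply as a user message on the other's thread.
    client.beta.threads.messages.create(
        thread_id=to_thread.id,
        role="user",
        content=f'{speaker_name} said: "{reply_text}"',
    )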

I haven’t come up with any great reason to use this model, however.

One major problem I faced with implementing a similar solution is latency; this back-and-forth communication and the custom function calls take up a lot of time.
I even implemented a similar flow with chat completions so that I could get streaming, but even then the time to first token was around 20 seconds.

1 Like

After spending some time with this API, I think the Assistants API is intended for a class of problems more complex than what is normally done through a chat API: problems that are best solved through a collaboration of specialized Assistants. It’s one thing to ask GPT to write a chapter in a book, or a code fragment for a program; it’s another thing to ask GPT to write a book or a software product. The Chat Completions API is for the former, the Assistants API is for the latter. Using the Assistants API is about carefully defining a work product like a book or a software product, determining what kinds of experts you want to collaborate on delivering that work product and what your completion criteria are, then letting it run. It might take an hour to write the first draft of that book, but that is arguably better than wrangling humans together to do the same task.

1 Like

I’m wondering if this can be done within ChatGPT GPT instructions, without API calls, and with perhaps three sets of instructions plus a sort of orchestrator set of instructions to facilitate?

1 Like

What you did is fantastic. I have been studying conversations between multiple assistants, and my question (even using your example) is how many times the assistants should go back and forth with the information; that is, where to define how many revisions should be made, and what criteria to use to say: OK, now it’s good. Have you thought about this?

2 Likes

Yes, I thought about the same thing. It’s difficult to measure, but in the realm of creative writing it seemed that the first exchange between writer and critic yielded most of the value, with rapidly diminishing returns after that; I hesitate to extrapolate to other use cases, though. For example, setting up assistants to debate each other on some topics might require more back-and-forth exchanges. I expect to see much work done in this area of agent/assistant optimization: for a given problem, how many agents/assistants, what expertise you need to bring to the table, and how many iterations on the same topic.
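One simple way to encode a stopping rule like this is to cap the rounds and also stop early on an approval signal. A sketch, assuming a run_assistant helper that runs one turn and returns the reply text, and an "APPROVED" sentinel you would define in the critic’s instructions:

MAX_ROUNDS = 3  # creative writing saw diminishing returns after the first exchange

for round_number in range(MAX_ROUNDS):
    run_assistant(thread, assistantWriter)             # assumed helper: run one turn
    critique = run_assistant(thread, assistantCritic)  # returns the reply text
    if "APPROVED" in critique:  # sentinel set in the critic's instructions
        break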

1 Like

Yes, because this was a simple example intended for instructional purposes, I think it could be replicated easily enough in ChatGPT. Real-world problems are going to require more assistants with specialized expertise and more iterations. The real advantage of programs and APIs in this case is being able to scale this up for real-world problems.

1 Like

Hi, a request from a non-coder newbie. I’ve created a set of assistants, and as in all your use cases, I would like a few of them to collaborate. I don’t want them to automatically decide who to collaborate with right now, but I am able to clearly instruct an assistant when it should bring in the one named ā€˜xyz’, and so on.

Here’s my challenge: I built a site using Bubble, where a conversation between a human user and one assistant happens in what they call a ā€˜repeating group’. What I want to know is how I get that assistant to refer to the other assistants, and how I then display the messages on Bubble as coming from separate assistants, while also getting the user to give feedback on the way to a final output.

I’d really appreciate guidance.

1 Like

Did you ever figure this out? Curious if there’s an answer.

@dlaytonj2, @james.kittock some excellent contributions from you both. I have been playing around with a multi-assistant workflow. In my scenario I have a drafting assistant and then 7 reviewing assistants, all instructed to provide four key feedback elements. Here are some of my observations:

  1. When you query the thread about how many assistants responded with X, it will say 1 (e.g. ā€œThere was one assistant (me)…ā€)
  2. When you ask different agents how many instances of feedback were received, they respond with a different number of instances (e.g. one agent might say 2, one might say 3, and another might say 4, but none of them responds with the correct answer, which should be 7 - one for each assistant)
  3. It appears that there is significant overlap in the feedback given by these assistants (i.e. they don’t appear to be paying attention to each other’s feedback). I want to play around with better instructions around this, or dialling the temperature up or down
  4. Despite these flaws, the final draft is undoubtedly much higher quality. In my scenario, it paid attention to feedback to include metrics and stats, and to use more emotive language, etc. - this I found mind-blowing!!!

I am going to play around with some of the suggestions this weekend, especially giving them names in the instructions and prepending the messages with this name - I had a similar thought for the messages, but overlooked including this in the instruction which is a great suggestion.

It is item number two above that has me the most concerned. If all agents have contributed to the same thread, then in theory they should all be able to count the exact number of instances of anything within the thread.
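Counting is something the model does unreliably, but calling code can do it deterministically, since each message in the thread carries the assistant_id that produced it; a sketch:

from collections import Counter

# Tally messages per assistant directly from the thread, rather than
# asking a model to count them.
messages = client.beta.threads.messages.list(thread_id=thread.id, limit=100)
feedback_counts = Counter(
    m.assistant_id for m in messages if m.assistant_id is not None
)
print(feedback_counts)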

Any suggestions or feedback would be greatly appreciated!

You have to create several prompts that call different functions depending on context, so in each prompt you can choose to call a function that only triggers the activation of another assistant. For the second assistant you then select the same thread but a different assistant. I have implemented it…

I cross-posted a version of what you are looking for here: On ChatGPT's use of "Personality" in its system prompt, and API use - #2 by icdev2dev

Your concern about the number of times an assistant has made a post is absolutely valid. The way my framework works, I enable the counting through the Message itself; i.e., I store the originator as metadata on the AutoExecSubMessage that goes to the AutoExecSubThread.
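In plain Assistants API terms, that looks something like attaching a metadata field when the message is created (the key name is an assumption; AutoExecSubMessage is specific to the framework linked above):

# Store who originated the message so a later function can count posts
# per assistant without asking a model to do the counting.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=reply_text,
    metadata={"originator": assistant_name},  # assumed key name
)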

Occasionally a SystemFunction counts and posts the results on the thread, and the assistants can use those for guidance on when to speak up or shut up.