Assistants API Multi-Assistant Agentic Workflow

My idea is to develop a proxy assistant that acts as the decision-maker for specialized sub-assistants, each dedicated to a particular task and accessible via function calls. However, when the proxy assistant recognizes the need to engage one of these sub-assistants, it hits a problem: the run gets locked in a “requires_action” state, preventing the initiation of another run with the sub-assistant on the same thread.

I’m aware that “When a Run is in_progress and not in a terminal state, the Thread is locked,” so this is the expected behavior, but…

Has anyone else built a similar solution in a different way or found a workaround for this issue?

Also, I’d rather not use a framework like CrewAI.
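To make the lock concrete, here is a minimal sketch in plain Python, with no API calls. The `Thread` class below only simulates the documented behavior (a thread refuses a new run while an existing run is non-terminal); the point is that the sub-assistant work has to happen inside the tool handler of the same run, not as a second run on the locked thread:

```python
# Simulation of the Assistants API thread lock: a thread rejects a new run
# while an existing run is in a non-terminal state such as "requires_action".

TERMINAL = {"completed", "failed", "cancelled", "expired"}

class Thread:
    def __init__(self):
        self.runs = []

    def create_run(self, status="requires_action"):
        if any(r["status"] not in TERMINAL for r in self.runs):
            raise RuntimeError("Thread is locked by an active run")
        run = {"status": status}
        self.runs.append(run)
        return run

    def submit_tool_outputs(self, run, output):
        # Resolving the tool call moves the run to a terminal state
        # and releases the lock on the thread.
        run["output"] = output
        run["status"] = "completed"

thread = Thread()
proxy_run = thread.create_run()          # proxy run asks for a tool call

try:
    thread.create_run()                  # attempt a second run on the same thread
except RuntimeError as e:
    print(e)                             # prints "Thread is locked by an active run"

# The way out: do the sub-assistant work inside the tool handler, then
# submit its result as the tool output of the *same* run.
sub_result = "answer from sub-assistant" # stands in for a real sub-assistant call
thread.submit_tool_outputs(proxy_run, sub_result)
follow_up = thread.create_run("completed")  # the thread is unlocked again
```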


Hi Andres

First of all, the Assistants API is still in beta, so don’t expect something fully usable in production.

Second, it would help if you shared the code and the logs. In many cases the problem lies in the implementation rather than in the protocol.

How are you implementing the sub-assistants? Perhaps you could create an endpoint for each sub-assistant and just call them like an external API.

Since in the Assistants API an assistant and a run are separate notions, you can do runs with any number of different assistants within one thread. The only problem is that these assistants are not aware of each other and treat all previous assistant messages as their own.


I will break down the reasoning into two concepts to facilitate understanding. My initial idea was to have several assistants operating within the same thread and a proxy assistant acting as an orchestrator. Each sub-assistant is a function call, so when the proxy assistant identifies the need for a sub-assistant, the run enters the ‘requires_action’ state, and apparently this state counts as in progress. Therefore, I can’t run the sub-assistants in the same thread because it gets locked.

The second approach I followed consists of the same set of proxy assistant and sub-assistants, but when the proxy assistant identifies the need for a sub-assistant, an individual thread is created for the sub-assistant and the necessary information is passed through a function call. This allowed a back-and-forth between the two threads, like a conversation. I ended up doing something like this: Optimizing AI: Building Advanced Multi-Agent Frameworks with Azure OpenAI Assistant API
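The control flow of this second approach can be sketched like this (a local simulation, not real API calls: `sub_assistant()` stands in for creating a separate thread and running the sub-assistant on it, and `call_sub_assistant` is a hypothetical tool name):

```python
# Sketch of the separate-thread delegation: the proxy run's tool call
# spawns work on a *different* thread for the sub-assistant, and the
# sub-assistant's reply is fed back to the proxy as the tool output.

def sub_assistant(task: str) -> str:
    # A real implementation would create a new thread, add the task as a
    # user message, run the sub-assistant on it, and read back the reply.
    return f"[sub-assistant result for: {task}]"

def proxy(user_request: str) -> str:
    # The proxy "model" decides it needs the sub-assistant; in the real
    # API this surfaces as a requires_action state with a tool call.
    tool_call = {"name": "call_sub_assistant",
                 "arguments": {"task": user_request}}
    tool_output = sub_assistant(tool_call["arguments"]["task"])
    # Submitting the tool output lets the proxy run finish with a final
    # answer that incorporates the sub-assistant's work.
    return f"Proxy answer based on: {tool_output}"

print(proxy("summarize the Q3 report"))
```

The main thread never gets a second run; it only waits on the tool output, which is why this variant avoids the lock (at the cost of the slow cross-thread round trips described below).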

This works; however, the back-and-forth between the threads takes a lot of time, and OpenAI often ends up canceling the run or marking it as expired.


By function calls. The same thing would happen, since I would be trying to start another run in the same thread while it is in the “requires_action” state.

So when you try to hit an API in a GPT, it asks for permission from the user before it continues. Maybe it’s hanging, waiting for user authorization? If so, perhaps there is a way you can respond programmatically and move on?

Yeah, sometimes it does. Since my front end is a chat, users can reply to keep the chat going, which is good because it starts a new run. I’ve noticed that sometimes OpenAI cancels a run if it’s taking too long to ‘complete’, even if it’s still ‘in progress’.

This user implemented a solution similar to what I had in mind: Assistants API - Access to multiple assistants - #25 by james.kittock. I plan to test the shared-thread approach he detailed to see if it offers any advantages.

Interesting approach!

I’ll try it on my use-case. As of now, I’m using the following architecture:

  • Several function specific assistants
  • 1 thread

I start with a welcoming assistant that has one tool. Besides performing the needed action, this tool moves the thread to the next assistant and updates its instructions based on input from the conversation with the first one.

For example, a chatbot site editor:

  • the welcoming assistant requests the page the user wants to manage
  • once received, a tool is used that adds page_id to the metadata and changes the current assistant ID to the next one
  • the next assistant can edit the page’s text (tool 1), layout (tool 2), and settings (tool 3)
  • there can be a further assistant to which we move after any tool use
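A rough sketch of that routing idea, with the assistants stubbed as plain functions and the thread as a dict (no API calls; all names here are illustrative, but the mechanism mirrors the metadata handoff above):

```python
# Single-thread router: the thread's metadata records which assistant is
# "current", and a tool can hand the thread off to the next assistant.

def welcoming_assistant(thread, message):
    # Tool behavior: record the chosen page and hand off to the editor.
    thread["metadata"]["page_id"] = message
    thread["metadata"]["current_assistant"] = "editor"
    return f"Got it, managing page {message}."

def editor_assistant(thread, message):
    page = thread["metadata"]["page_id"]
    return f"Editing page {page}: {message}"

ASSISTANTS = {"welcoming": welcoming_assistant, "editor": editor_assistant}

def dispatch(thread, message):
    # Route every new message to whichever assistant the metadata points at.
    current = thread["metadata"]["current_assistant"]
    return ASSISTANTS[current](thread, message)

thread = {"metadata": {"current_assistant": "welcoming"}}
print(dispatch(thread, "page-42"))          # handled by the welcoming assistant
print(dispatch(thread, "make title bold"))  # handled by the editor assistant
```

In the real API, the dispatch step would be "create a run on the thread with the assistant ID stored in the thread’s metadata", so only one run is ever active at a time.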

Forgive me if I totally misunderstand your problem, but why not copy the messages in the thread and create a new thread with those messages for the sub-assistants to run on?
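Something along these lines (pure local code, no API calls). One wrinkle: if the API only lets you add user-role messages to a thread, prior assistant turns have to be folded into user messages with a speaker prefix, as the helper below assumes:

```python
# Sketch of the copy-the-thread idea: rewrite a thread's history so it
# can be replayed into a fresh thread as user-role messages only.

def copy_for_sub_assistant(messages):
    """Flatten mixed-role history into user-role messages, prefixing
    former assistant turns so the sub-assistant can tell them apart."""
    copied = []
    for m in messages:
        if m["role"] == "assistant":
            copied.append({"role": "user",
                           "content": f"(previous assistant said) {m['content']}"})
        else:
            copied.append({"role": "user", "content": m["content"]})
    return copied

history = [
    {"role": "user", "content": "Plan a blog post about tides."},
    {"role": "assistant", "content": "Outline: 1) moon, 2) sun, 3) geography."},
]
for m in copy_for_sub_assistant(history):
    print(m["role"], "|", m["content"])
```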

Sharing another experimental approach for multi-agent workflow for your information and ideation further.

azureai-assistant-tool/samples/MultiAgentCodeOrchestration at main · Azure-Samples/azureai-assistant-tool

In that sample, the user communicates with the TaskPlannerAgent (which uses the chat completion API) in their own thread. The TaskPlannerAgent creates and schedules tasks for the necessary assistants to fulfill the user’s request, based on the request itself and knowledge about the available assistants (see the YAML configurations for CodeProgrammerAgent and CodeInspectionAgent in the config folder).


Think of my example as a kind of “hello world” for assistants. I purposely kept it simple. In this case I saw “writer” and “critic” as peers, so the notion of sub-assistants and multiple threads didn’t really apply, but for more complex real-world examples you are right. In a world without AI, can the problem be solved by getting a small group of ‘experts’ into a room, or does the solution require a larger team, with small groups going into breakout sessions and then reporting back for further discussion until a solution is reached? Assistants and agents require us to think about what kind of expertise we need and how to structure the team doing the problem solving.


What was your example? I don’t see any other posts from you in this thread!

Quick update, everyone.

Thanks for sharing!

For the first production version, I ended up with two Assistants like this:

  • Task Planner Agent: responsible for creating the task execution plan and executing it.
  • Critic Agent: responsible for reviewing the generated content and providing feedback.

Task Planner Agent calls Critic Agent through function call.

I kept the strategy of a main assistant with a main thread and secondary threads for the sub-assistant. This approach worked best for me. Most of the time, the final output has better quality than just one agent working on the thread.
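The planner/critic loop can be sketched like this (a local simulation: `plan_and_draft()` and `critique()` stand in for the two Assistants, the critic is reached through what would be a function call on the planner’s run, and the loop stops once the critic approves or a round limit is hit):

```python
from typing import Optional

def plan_and_draft(task: str, feedback: Optional[str] = None) -> str:
    # Stand-in for the Task Planner Agent: produce (or revise) a draft.
    draft = f"draft for '{task}'"
    if feedback:
        draft += f" (revised per: {feedback})"
    return draft

def critique(draft: str) -> tuple:
    # Stand-in for the Critic Agent. Toy rule: approve anything that
    # has been revised at least once; otherwise ask for more detail.
    if "revised" in draft:
        return True, "looks good"
    return False, "add more detail"

def run_workflow(task: str, max_rounds: int = 3) -> str:
    feedback = None
    for _ in range(max_rounds):
        draft = plan_and_draft(task, feedback)
        approved, feedback = critique(draft)
        if approved:
            return draft
    return draft  # stop after max_rounds, mirroring run expiry limits

print(run_workflow("launch announcement"))
```

Bounding the number of critique rounds matters in practice: an unbounded loop is exactly the slow back-and-forth that got runs canceled or expired earlier in this thread.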

I couldn’t properly handle the run being canceled in order to execute the sub-assistant on the main thread.

This would be a huge increase in token usage. And the Assistants API doesn’t work like the Chat Completions API: I can’t add a message with the assistant role to the thread.

Btw, welcome to the community @WokeBloke!!! Thanks for joining us.

Totally agree.

However, passing the chat history to a secondary thread coherently through function calling is a challenge. Sometimes it works; sometimes it sends only half of the information to the sub-assistant.

So I wonder whether, with a larger set of assistants, we can manage this information transfer via function calls without losing relevant information along the way.

I hope this multi-agent workflow is addressed in the next major update from OpenAI.

I’ll roll out this version and gather user feedback.

Note: I ended up implementing message streaming, which has decreased the response time a bit.

Sorry, wrong thread; apologies for the confusion. … Assistants API - Access to multiple assistants

Andres, I think you can use the event system to solve this. Just queue the operation that needs access to the thread and run it when you get a message.
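Something like this (a local simulation; the `ThreadScheduler` name and shape are hypothetical, and in the real API the "run finished" trigger would be a terminal run status observed via polling or streaming events):

```python
from collections import deque

# Queue-and-drain idea: operations that need the thread are queued while
# a run holds the lock, and drained when the "run finished" event fires.

class ThreadScheduler:
    def __init__(self):
        self.pending = deque()
        self.busy = False
        self.log = []

    def request_run(self, name):
        if self.busy:
            self.pending.append(name)   # thread locked: queue the request
        else:
            self._start(name)

    def _start(self, name):
        self.busy = True
        self.log.append(f"started {name}")

    def on_run_finished(self):
        # Event handler: the lock is released, so start the next queued run.
        self.busy = False
        if self.pending:
            self._start(self.pending.popleft())

sched = ThreadScheduler()
sched.request_run("proxy")          # starts immediately
sched.request_run("sub-assistant")  # queued: the thread is busy
sched.on_run_finished()             # proxy done -> sub-assistant starts
print(sched.log)                    # ['started proxy', 'started sub-assistant']
```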


Hey @andres_santos! I’ve set out to create the same orchestration model you have with one proxy assistant and thread, and several sub assistants and their threads.

Preface: I’m not an engineer or developer; I’m just super keen on trying this type of orchestration on a side project I’m working on. I used the Azure Multimodal Multi-Agents example, but I can’t quite get there even though I detached the Azure dependencies and rewrote the API calls to spec.

Would you be willing to share this project of yours to the community?