Assistants API Multi-Assistant Agentic Workflow

My idea is to develop a proxy assistant that acts as the decision-maker for specialized sub-assistants, each dedicated to a particular task and accessible via function calls. However, when the proxy assistant recognizes the need to engage one of these sub-assistants, it hits a problem: the run gets locked in a “requires_action” state, preventing the initiation of another run with the sub-assistant on the same thread.

I’m aware that “When a Run is in_progress and not in a terminal state, the Thread is locked,” so this is the expected behavior, but…

Has anyone else built a similar solution in a different way or found a workaround for this issue?

Also, I’d rather not use a framework like CrewAI.
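To make the lock concrete, here is a minimal sketch in plain Python, with no API calls. The `Thread` class below only simulates the documented behavior (a thread refuses a new run while an existing run is non-terminal); the point is that the sub-assistant work has to happen inside the tool handler of the same run, not as a second run on the locked thread:

```python
# Simulation of the Assistants API thread lock: a thread rejects a new run
# while an existing run is in a non-terminal state such as "requires_action".

TERMINAL = {"completed", "failed", "cancelled", "expired"}

class Thread:
    def __init__(self):
        self.runs = []

    def create_run(self, status="requires_action"):
        if any(r["status"] not in TERMINAL for r in self.runs):
            raise RuntimeError("Thread is locked by an active run")
        run = {"status": status}
        self.runs.append(run)
        return run

    def submit_tool_outputs(self, run, output):
        # Resolving the tool call moves the run to a terminal state
        # and releases the lock on the thread.
        run["output"] = output
        run["status"] = "completed"

thread = Thread()
proxy_run = thread.create_run()          # proxy run asks for a tool call

try:
    thread.create_run()                  # attempt a second run on the same thread
except RuntimeError as e:
    print(e)                             # prints "Thread is locked by an active run"

# The way out: do the sub-assistant work inside the tool handler, then
# submit its result as the tool output of the *same* run.
sub_result = "answer from sub-assistant" # stands in for a real sub-assistant call
thread.submit_tool_outputs(proxy_run, sub_result)
follow_up = thread.create_run("completed")  # the thread is unlocked again
```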


Hi Andres

First of all, the Assistants API is still in beta, so don’t expect something fully usable in production.

Second, it would help if you shared the code and the logs. In many cases the problem lies in the implementation rather than in the protocol.

How are you implementing the sub-assistants? Perhaps you could create an endpoint for each sub-assistant and just call them like an external API.

Since in the Assistants API an assistant and a run are separate notions, you can do runs with any number of different assistants within one thread. The only problem is that these assistants are not aware of each other and treat all previous assistant messages as their own.


I will break down the reasoning into two concepts to facilitate understanding. My initial idea was to have several assistants operating within the same thread and a proxy assistant acting as an orchestrator. Each sub-assistant is a function call, so when the proxy assistant identifies the need for a sub-assistant, the run enters the ‘requires_action’ state, and apparently this state counts as in progress. Therefore, I can’t run the sub-assistants in the same thread because it gets locked.

The second approach I followed consists of the same set of proxy assistant and sub-assistants, but when the proxy assistant identifies the need for a sub-assistant, an individual thread is created for the sub-assistant and the necessary information is passed through a function call. This allowed a back-and-forth between the two threads, like a conversation. I ended up doing something like this: Optimizing AI: Building Advanced Multi-Agent Frameworks with Azure OpenAI Assistant API
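The control flow of this second approach can be sketched like this (a local simulation, not real API calls: `sub_assistant()` stands in for creating a separate thread and running the sub-assistant on it, and `call_sub_assistant` is a hypothetical tool name):

```python
# Sketch of the separate-thread delegation: the proxy run's tool call
# spawns work on a *different* thread for the sub-assistant, and the
# sub-assistant's reply is fed back to the proxy as the tool output.

def sub_assistant(task: str) -> str:
    # A real implementation would create a new thread, add the task as a
    # user message, run the sub-assistant on it, and read back the reply.
    return f"[sub-assistant result for: {task}]"

def proxy(user_request: str) -> str:
    # The proxy "model" decides it needs the sub-assistant; in the real
    # API this surfaces as a requires_action state with a tool call.
    tool_call = {"name": "call_sub_assistant",
                 "arguments": {"task": user_request}}
    tool_output = sub_assistant(tool_call["arguments"]["task"])
    # Submitting the tool output lets the proxy run finish with a final
    # answer that incorporates the sub-assistant's work.
    return f"Proxy answer based on: {tool_output}"

print(proxy("summarize the Q3 report"))
```

The main thread never gets a second run; it only waits on the tool output, which is why this variant avoids the lock (at the cost of the slow cross-thread round trips described below).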

This works; however, the back-and-forth between the threads takes a lot of time, and OpenAI often ends up canceling the run or marking it as expired.


By function calls. The same thing would happen, since I would be trying to start another run in the same thread while it is in the “requires_action” state.

So when you try to hit an API in a GPT, it asks for permission from the user before it continues. Maybe it’s hanging, waiting for user authorization? If so, perhaps there is a way you can respond programmatically and move on?

Yeah, sometimes it does. Since my front end is a chat, users can reply to keep the chat going, which is good because it starts a new run. I’ve noticed that sometimes OpenAI cancels a run if it’s taking too long to ‘complete’, even if it’s still ‘in progress’.

This user implemented a solution similar to what I had in mind: Assistants API - Access to multiple assistants - #25 by james.kittock. I plan to test the shared-thread approach he detailed to see if it offers any advantages.

Interesting approach!

I’ll try it on my use-case. As of now, I’m using the following architecture:

  • Several function specific assistants
  • 1 thread

I start with a welcoming assistant that has one tool. Besides performing the needed action, this tool moves the thread to the next assistant and updates its instructions based on input from the conversation with the first one.

For example, a chatbot site editor:

  • the welcoming assistant requests the page the user wants to manage
  • once received, a tool is used that adds page_id to the metadata and changes the current assistant ID to the next one
  • the next assistant can edit the page’s text (tool 1), layout (tool 2), and settings (tool 3)
  • there can be a further assistant to which we move after any tool use
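A rough sketch of that routing idea, with the assistants stubbed as plain functions and the thread as a dict (no API calls; all names here are illustrative, but the mechanism mirrors the metadata handoff above):

```python
# Single-thread router: the thread's metadata records which assistant is
# "current", and a tool can hand the thread off to the next assistant.

def welcoming_assistant(thread, message):
    # Tool behavior: record the chosen page and hand off to the editor.
    thread["metadata"]["page_id"] = message
    thread["metadata"]["current_assistant"] = "editor"
    return f"Got it, managing page {message}."

def editor_assistant(thread, message):
    page = thread["metadata"]["page_id"]
    return f"Editing page {page}: {message}"

ASSISTANTS = {"welcoming": welcoming_assistant, "editor": editor_assistant}

def dispatch(thread, message):
    # Route every new message to whichever assistant the metadata points at.
    current = thread["metadata"]["current_assistant"]
    return ASSISTANTS[current](thread, message)

thread = {"metadata": {"current_assistant": "welcoming"}}
print(dispatch(thread, "page-42"))          # handled by the welcoming assistant
print(dispatch(thread, "make title bold"))  # handled by the editor assistant
```

In the real API, the dispatch step would be "create a run on the thread with the assistant ID stored in the thread’s metadata", so only one run is ever active at a time.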

Forgive me if I totally misunderstand your problem, but why not copy the messages in the thread and create a new thread with those messages for the sub-assistants to run on?
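Something along these lines (pure local code, no API calls). One wrinkle: if the API only lets you add user-role messages to a thread, prior assistant turns have to be folded into user messages with a speaker prefix, as the helper below assumes:

```python
# Sketch of the copy-the-thread idea: rewrite a thread's history so it
# can be replayed into a fresh thread as user-role messages only.

def copy_for_sub_assistant(messages):
    """Flatten mixed-role history into user-role messages, prefixing
    former assistant turns so the sub-assistant can tell them apart."""
    copied = []
    for m in messages:
        if m["role"] == "assistant":
            copied.append({"role": "user",
                           "content": f"(previous assistant said) {m['content']}"})
        else:
            copied.append({"role": "user", "content": m["content"]})
    return copied

history = [
    {"role": "user", "content": "Plan a blog post about tides."},
    {"role": "assistant", "content": "Outline: 1) moon, 2) sun, 3) geography."},
]
for m in copy_for_sub_assistant(history):
    print(m["role"], "|", m["content"])
```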

Sharing another experimental approach for multi-agent workflow for your information and ideation further.

azureai-assistant-tool/samples/MultiAgentCodeOrchestration at main · Azure-Samples/azureai-assistant-tool

In that sample, the user communicates with the TaskPlannerAgent (which uses the chat completion API) in their own thread. The TaskPlannerAgent creates and schedules tasks for the necessary assistants to fulfill the user’s request, based on the request itself and knowledge about the available assistants (see the YAML configurations for CodeProgrammerAgent and CodeInspectionAgent in the config folder).


Think of my example as a kind of “hello world” for assistants. I purposely kept it simple. In this case I saw “writer” and “critic” as peers, so the notion of sub-assistants and multiple threads didn’t really apply, but for more complex real-world examples you are right. In a world without AI, can the problem be solved by getting a small group of ‘experts’ into a room, or does the solution require a larger team, with small groups going into breakout sessions and then reporting back for further discussion until a solution is reached? Assistants and agents require us to think about what kind of expertise we need and how to structure the team doing the problem solving.


What was your example? I don’t see any other posts from you in this thread!

Quick update, everyone.

Thanks for sharing!

For the first production version, I ended up with two Assistants like this:

  • Task Planner Agent: responsible for creating the task execution plan and executing it.
  • Critic Agent: responsible for reviewing the generated content and providing feedback.

Task Planner Agent calls Critic Agent through function call.

I kept the strategy of a main assistant with a main thread and secondary threads for the sub-assistant. This approach worked best for me. Most of the time, the final output has better quality than just one agent working on the thread.
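The planner/critic loop can be sketched like this (a local simulation: `plan_and_draft()` and `critique()` stand in for the two Assistants, the critic is reached through what would be a function call on the planner’s run, and the loop stops once the critic approves or a round limit is hit):

```python
from typing import Optional

def plan_and_draft(task: str, feedback: Optional[str] = None) -> str:
    # Stand-in for the Task Planner Agent: produce (or revise) a draft.
    draft = f"draft for '{task}'"
    if feedback:
        draft += f" (revised per: {feedback})"
    return draft

def critique(draft: str) -> tuple:
    # Stand-in for the Critic Agent. Toy rule: approve anything that
    # has been revised at least once; otherwise ask for more detail.
    if "revised" in draft:
        return True, "looks good"
    return False, "add more detail"

def run_workflow(task: str, max_rounds: int = 3) -> str:
    feedback = None
    for _ in range(max_rounds):
        draft = plan_and_draft(task, feedback)
        approved, feedback = critique(draft)
        if approved:
            return draft
    return draft  # stop after max_rounds, mirroring run expiry limits

print(run_workflow("launch announcement"))
```

Bounding the number of critique rounds matters in practice: an unbounded loop is exactly the slow back-and-forth that got runs canceled or expired earlier in this thread.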

I couldn’t properly handle the run being canceled in order to execute the sub-assistant on the main thread.

This would be a huge increase in token usage. And the Assistants API doesn’t work like the Chat Completions API: I can’t add a message with the assistant role to the thread.

Btw, welcome to the community @WokeBloke!!! Thanks for joining us.

Totally agree.

However, passing the chat history to a secondary thread coherently through function calling is a challenge. Sometimes it works; sometimes it sends only half of the information to the sub-assistant.

So I wonder whether, with a larger set of assistants, we can manage this information transfer via function calls without losing relevant information along the way.

I hope this multi-agent workflow is addressed in the next major update from OpenAI.

I’ll roll out this version and gather user feedback.

Note: I ended up implementing message streaming, which has decreased the response time a bit.

Sorry, wrong thread; apologies for the confusion. … Assistants API - Access to multiple assistants

Andres, I think you can use the event system to solve this. Just queue the operation that needs access to the thread and run it when you get a message.
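Something like this (a local simulation; the `ThreadScheduler` name and shape are hypothetical, and in the real API the "run finished" trigger would be a terminal run status observed via polling or streaming events):

```python
from collections import deque

# Queue-and-drain idea: operations that need the thread are queued while
# a run holds the lock, and drained when the "run finished" event fires.

class ThreadScheduler:
    def __init__(self):
        self.pending = deque()
        self.busy = False
        self.log = []

    def request_run(self, name):
        if self.busy:
            self.pending.append(name)   # thread locked: queue the request
        else:
            self._start(name)

    def _start(self, name):
        self.busy = True
        self.log.append(f"started {name}")

    def on_run_finished(self):
        # Event handler: the lock is released, so start the next queued run.
        self.busy = False
        if self.pending:
            self._start(self.pending.popleft())

sched = ThreadScheduler()
sched.request_run("proxy")          # starts immediately
sched.request_run("sub-assistant")  # queued: the thread is busy
sched.on_run_finished()             # proxy done -> sub-assistant starts
print(sched.log)                    # ['started proxy', 'started sub-assistant']
```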


Hey @andres_santos! I’ve set out to create the same orchestration model you have with one proxy assistant and thread, and several sub assistants and their threads.

Preface: I’m not an engineer or developer; I’m just super keen on trying this type of orchestration on a side project I’m working on. I used the Azure Multimodal Multi-Agents example, but I can’t quite get there even though I detached the Azure dependencies and rewrote the API calls to spec.

Would you be willing to share this project of yours to the community?