[Assistant API] Discrepancies in Function Triggering and Output Submission in Bulk Message Threads with OpenAI API

I’m reaching out to seek insights into an issue I’ve encountered while working with the OpenAI API. My workflow involves creating a thread with a substantial number of messages (around 20-30), where each message prompts the assistant to trigger a specific function, and then starting a run on that thread. This setup has been functioning as expected, but I’ve noticed unusual behavior when the run’s status shifts to “requires_action”.

The problem appears when the status changes even though the number of triggered functions doesn’t match the number of messages. My usual approach is to list the run steps, handle the functions that have been triggered, and then submit the outputs of these tools. However, when I submit the tool outputs, the OpenAI API seems to expect only a subset of them, rather than the full set of calls returned by the run steps.

This leads to my question: Why does the OpenAI API require only a part of the returned tool function calls? Is there a specific reason or mechanism behind this selective recognition of tool outputs? Any insights into how the API processes these outputs in a multi-threaded context, or suggestions to ensure that all expected outputs are correctly recognized and submitted, would be greatly appreciated.

Looking forward to your thoughts and suggestions.

Best regards,

Here is some raw but explicit output from my code:

OpenAI must process 31 messages
Run id run_sYSUuFS8CwbfZG809JcxP20W and thread id thread_O9voAKnukrJESwvGJ1g0DEQa
Processed messages 0/31

Tool call: call_oiXAcxU9x73ZTH2iVlRMNwzv 
Tool call: call_gDlsjo6gs8oLM2G1V8qQvR4P 
Retrieved 2 messages
2024/01/13 22:48:45 Status: requires_action & [0xc000469500 0xc000469530]

Processed messages 2/31

Tool call: call_iBncKfQmTLuGQfhgNC0begip
Tool call: call_mdBcrdZJUX5kfZHOmyO1Htfq
Tool call: call_bdRpl2ghV0gi0b0SZFW1YKL1
Tool call: call_oNlTH1UURnw6GQWTaZe08bTZ
Tool call: call_oiXAcxU9x73ZTH2iVlRMNwzv
Tool call: call_gDlsjo6gs8oLM2G1V8qQvR4P

panic: error, status code: 400, message: Expected tool outputs for call_ids ['call_iBncKfQmTLuGQfhgNC0begip', 'call_mdBcrdZJUX5kfZHOmyO1Htfq', 'call_bdRpl2ghV0gi0b0SZFW1YKL1', 'call_oNlTH1UURnw6GQWTaZe08bTZ'], got ['call_iBncKfQmTLuGQfhgNC0begip', 'call_mdBcrdZJUX5kfZHOmyO1Htfq', 'call_bdRpl2ghV0gi0b0SZFW1YKL1', 'call_oNlTH1UURnw6GQWTaZe08bTZ', 'call_oiXAcxU9x73ZTH2iVlRMNwzv', 'call_gDlsjo6gs8oLM2G1V8qQvR4P']

Curious about your loop.
You create a thread with 31 messages in it?
Or do you create a thread with 1 in it and then run?
Just trying to understand. In your output, when it says ‘requires_action’, do you then submit those two function call outputs?

Let me correct and clarify the process:

  1. Thread Creation and Bulk Message Addition: Initially, a single thread is created. After its creation, a bulk of messages, around 31 in total, are added to this thread all at once. This is not a sequential addition but a one-time bulk operation.
  2. Execution of the Run: Once the messages are in place, a run is executed on the entire thread. This run is meant to process all the messages in the thread and trigger corresponding functions through the AI assistant.
  3. Discrepancy in Function Triggering: The issue arises when, after the run execution, the number of functions triggered does not match the number of messages in the thread. Specifically, the response back triggers fewer functions than there are messages.
  4. Handling ‘Requires Action’ Status: When faced with a ‘requires_action’ status, the expected course of action is to submit the outputs from the function calls that were triggered. However, the OpenAI API seems to only require a subset of these outputs for submission, not the entire set that one would expect given the number of messages.

Why do you expect a relationship between the number of messages and the number of function calls? In principle they are unrelated.

In this specific workflow, each message is designed to trigger a function, hence the expectation of equal numbers. Understanding why this isn’t happening is the key issue here.

Looking at your error message, it seems you are sending two more than it expects?
(It’s expecting 4, not 6. Is it possible they come from two different ‘rounds’ and get looped together? Hard to say without the code.)

Expected tool outputs for call_ids ['call_iBncKfQmTLuGQfhgNC0begip', 'call_mdBcrdZJUX5kfZHOmyO1Htfq', 'call_bdRpl2ghV0gi0b0SZFW1YKL1', 'call_oNlTH1UURnw6GQWTaZe08bTZ'],

got ['call_iBncKfQmTLuGQfhgNC0begip', 'call_mdBcrdZJUX5kfZHOmyO1Htfq', 'call_bdRpl2ghV0gi0b0SZFW1YKL1', 'call_oNlTH1UURnw6GQWTaZe08bTZ', 'call_oiXAcxU9x73ZTH2iVlRMNwzv', 'call_gDlsjo6gs8oLM2G1V8qQvR4P']
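To make the mismatch in that error concrete, here is a minimal sketch that compares the two ID lists from the message. The helper name `unexpectedIDs` is made up for illustration and is not part of any OpenAI client; it only does a set difference on strings.

```go
package main

import "fmt"

// unexpectedIDs returns the submitted call IDs that the API did not ask for,
// in the order they were submitted. (Hypothetical helper, for illustration.)
func unexpectedIDs(expected, submitted []string) []string {
	want := make(map[string]struct{}, len(expected))
	for _, id := range expected {
		want[id] = struct{}{}
	}
	var extra []string
	for _, id := range submitted {
		if _, ok := want[id]; !ok {
			extra = append(extra, id)
		}
	}
	return extra
}

func main() {
	// IDs copied from the 400 error in the thread.
	expected := []string{
		"call_iBncKfQmTLuGQfhgNC0begip",
		"call_mdBcrdZJUX5kfZHOmyO1Htfq",
		"call_bdRpl2ghV0gi0b0SZFW1YKL1",
		"call_oNlTH1UURnw6GQWTaZe08bTZ",
	}
	submitted := append(append([]string{}, expected...),
		"call_oiXAcxU9x73ZTH2iVlRMNwzv",
		"call_gDlsjo6gs8oLM2G1V8qQvR4P",
	)
	for _, id := range unexpectedIDs(expected, submitted) {
		fmt.Println("not expected:", id)
	}
}
```

The two leftover IDs are the same two tool calls logged in the first round ("Processed messages 0/31"), which is consistent with the guess that calls from an earlier round are being resubmitted.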

This code creates the thread and the run, and responds to the tool calls. I’ve only included the relevant functions.


func (c *Assistant) retrieveAssistantFunctionCall(ctx context.Context, run openai.Run) ([]*ChatMessage, error) {
	steps, err := c.client.ListRunSteps(ctx, run.ThreadID, run.ID, openai.Pagination{})
	if err != nil {
		return nil, err
	}
	if steps.HasMore {
		printer.Warn("Run has more steps, we are not handling this case yet")
	}
	cm := make([]*ChatMessage, 0)
	toolCallIDs := make([]string, 0)
	for _, s := range steps.RunSteps {
		if s.Type != openai.RunStepTypeToolCalls {
			continue
		}
		for _, tc := range s.StepDetails.ToolCalls {
			printer.Debug("Tool call: %v", tc.ID)
			if tc.Type != openai.ToolTypeFunction {
				printer.Warn("Tool call type is not a function")
				continue
			}
			cm = append(cm, &ChatMessage{
				Metadata: make(map[string]any),
				Content:  tc.Function.Arguments,
			})
			toolCallIDs = append(toolCallIDs, tc.ID)
		}
	}

	toolOutputs := make([]openai.ToolOutput, len(toolCallIDs))
	for i, tcID := range toolCallIDs {
		toolOutputs[i] = openai.ToolOutput{
			ToolCallID: tcID,
			Output:     `{"success": true}`,
		}
	}

	// Respond to the function calls to process the next messages
	if _, err = c.client.SubmitToolOutputs(ctx, run.ThreadID, run.ID, openai.SubmitToolOutputsRequest{
		ToolOutputs: toolOutputs,
	}); err != nil {
		return nil, err
	}
	return cm, nil
}

func (c *Assistant) RunNewThread(ctx context.Context, messages []*ChatMessage) ([]*ChatMessage, error) {
	printer.Debug("OpenAI must process %d messages", len(messages))
	tm := make([]openai.ThreadMessage, len(messages))
	for i, message := range messages {
		tm[i] = openai.ThreadMessage{
			Content:  message.Content,
			Role:     openai.ThreadMessageRoleUser,
			Metadata: message.Metadata,
		}
	}
	run, err := c.client.CreateThreadAndRun(ctx, openai.CreateThreadAndRunRequest{
		RunRequest: openai.RunRequest{
			AssistantID: c.assistantID,
		},
		Thread: openai.ThreadRequest{
			Messages: tm,
		},
	})
	if err != nil {
		return nil, err
	}
	log.Printf("Run id %v and thread id %v", run.ID, run.ThreadID)
	status, err := c.waitForStatusToBeAcceptable(ctx, &run, openai.RunStatusCompleted, openai.RunStatusRequiresAction)
	if err != nil {
		return nil, err
	}
	l := len(messages)
	processedMessage := 0
	returnedMsgs := make([]*ChatMessage, 0)
	for status == openai.RunStatusRequiresAction && processedMessage < l {
		printer.Debug("Processed messages %d/%d", processedMessage, l)
		var msgs []*ChatMessage
		if msgs, err = c.retrieveAssistantFunctionCall(ctx, run); err != nil {
			return nil, err
		}
		printer.Debug("Retrieved %d messages", len(msgs))
		processedMessage += len(msgs)
		returnedMsgs = append(returnedMsgs, msgs...)
		status, err = c.waitForStatusToBeAcceptable(ctx, &run, openai.RunStatusCompleted, openai.RunStatusRequiresAction)
		if err != nil {
			return nil, err
		}
		log.Printf("Status: %s & %+v", status, msgs)
	}
	return returnedMsgs, nil
}

Let me know if you want me to clarify something.

It seems you are using run steps to determine the tool calls. As far as I have seen, the run steps are ‘observational’: you can query them to learn more about how GPT comes to its conclusions, and they accumulate over the whole run. But when the run has requires_action status, the run itself tells you exactly which calls it needs answered. So change your tool submission so that you ONLY provide outputs for the calls that are asked for.
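A minimal sketch of that idea, under stated assumptions: in the Assistants API, a run in requires_action status carries the pending calls under `required_action.submit_tool_outputs.tool_calls`. The `ToolCall` and `ToolOutput` types below are simplified stand-ins written for this example, not the types of the Go client used in the thread, so check the client's own documentation for the real field names.

```go
package main

import "fmt"

// ToolCall is a simplified stand-in for one pending call listed on the run
// (assumption: the real client exposes an ID plus function name/arguments).
type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

// ToolOutput is a simplified stand-in for one entry of the submission.
type ToolOutput struct {
	ToolCallID string
	Output     string
}

// buildOutputs answers exactly the pending calls: one output per call ID,
// nothing taken from earlier rounds or from the run steps.
func buildOutputs(pending []ToolCall, handle func(ToolCall) string) []ToolOutput {
	outputs := make([]ToolOutput, 0, len(pending))
	for _, tc := range pending {
		outputs = append(outputs, ToolOutput{ToolCallID: tc.ID, Output: handle(tc)})
	}
	return outputs
}

func main() {
	// Pretend these came from the run's required action in the second round.
	pending := []ToolCall{
		{ID: "call_iBncKfQmTLuGQfhgNC0begip"},
		{ID: "call_mdBcrdZJUX5kfZHOmyO1Htfq"},
	}
	outs := buildOutputs(pending, func(ToolCall) string { return `{"success": true}` })
	for _, o := range outs {
		fmt.Println(o.ToolCallID, o.Output)
	}
}
```

Because the list is rebuilt from the run on every requires_action pause, calls that were already answered in a previous round never end up in the next submission.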

Just FYI, I run Assistants without ever calling anything that has to do with ‘steps’.

I’m a bit late, but that seems to solve this issue. There is still another issue: the run can end as completed without handling all the messages or calling all the functions. I don’t think that is solvable on my side, since it comes from GPT itself and not the API.
Thanks for your help.


Glad you solved one - for the other part I think that might be a prompt problem (or fixable with improved prompting)? Feel free to share more!