Assistant Keeps Running in Loop, Exceeding Expected Token Usage

Hello OpenAI Community,

I’ve encountered an issue with the Assistants API using the GPT-3.5 Turbo model where the assistant unexpectedly entered a loop during execution, far exceeding the expected token usage. Although the run was intended as a single execution of a straightforward prompt, it consumed a total of 55878 input tokens, well beyond what a single interaction should use.

  • Assistant Thread ID: thread_2HWNotG3WRWgIlKAwIyUivIQ
  • Assistant Run ID: run_VY6ZywFuR76Xu4nwXL9X1UtQ

Upon reviewing the run steps, it appears that the assistant kept executing in a loop until it reached the token limit for my account. This behavior was unexpected as the model was supposed to run only once and then stop, especially considering that our assistant was not configured to use any recursive function calls or similar mechanisms—it was a simple prompt.

Key Details:

  • The run used a total of 55878 input tokens.
  • There were no explicit recursive or looped function calls in our prompt or configuration.
  • The expected behavior was a single execution in response to the user’s prompt.

I’m looking for guidance on the following:

  1. How can we prevent the assistant from entering such loops, especially when the prompt and use case do not inherently require or suggest multiple executions?
  2. Is there a specific parameter or configuration option that can be passed to ensure the assistant runs only once and stops, avoiding unintended token consumption?

Any insights, suggestions, or guidance on mitigating this issue and controlling token usage more effectively in similar scenarios would be greatly appreciated. Efficient, predictable token usage is crucial for us, so understanding the root cause of this loop behavior is essential.
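Regarding question 2: as far as I can tell there is no run-level token cap, but a rough client-side approximation is to poll the run and cancel it once it exceeds a budget. A minimal sketch, assuming the openai Python SDK v1.x (`runs.retrieve`, `runs.steps.list`, and `runs.cancel` as they exist in that SDK's Assistants beta; the budget numbers are placeholders you would tune):

```python
import time

def over_budget(elapsed_s, steps_done, max_seconds=60, max_steps=5):
    """Pure budget check: True once either limit is exceeded."""
    return elapsed_s > max_seconds or steps_done > max_steps

def watch_run(client, thread_id, run_id, max_seconds=60, max_steps=5):
    """Poll a run and cancel it once it blows its budget.

    `client` is assumed to be an openai.OpenAI() instance; the
    runs.retrieve / runs.steps.list / runs.cancel calls below match
    the v1.x Python SDK Assistants beta at the time of writing.
    """
    start = time.monotonic()
    while True:
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id, run_id=run_id)
        if run.status not in ("queued", "in_progress"):
            return run  # run finished (or failed) on its own
        steps = client.beta.threads.runs.steps.list(
            thread_id=thread_id, run_id=run_id)
        if over_budget(time.monotonic() - start, len(steps.data),
                       max_seconds, max_steps):
            client.beta.threads.runs.cancel(
                thread_id=thread_id, run_id=run_id)
            raise RuntimeError("run cancelled: budget exceeded")
        time.sleep(2)
```

This doesn't stop a loop from starting, but it bounds how much a runaway run can cost before cancellation.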

Thank you for your assistance.


Assistants have their own functions injected.

gpt-3.5-turbo-0613 is not compatible with retrieval or parallel tool calls.

The quality of the Assistants framework is out of your control: the AI may, for example, fail to understand the errors it has already made, or become confused when its context is filled with injected functions and irrelevant file knowledge.

Language models are often loopy and repetitive by nature; they find patterns and repeat them. With Assistants, you have no sampling parameters (temperature, penalties, logit bias, etc.) available to control that behavior.


Similar behaviour has been reported before. I hope it’s on their radar:


Did you check out the thread in the backend? My guess would be that the function call either fails or returns unexpected results and then gets recalled. You should be able to see the details (including data in and out) in the thread. Enable threads in Settings → Organization and then check out the thread in the main side menu.
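If you'd rather inspect from code than the dashboard, the run steps are also retrievable via the API. A sketch assuming the openai Python SDK v1.x (where steps come back from `client.beta.threads.runs.steps.list(thread_id=..., run_id=...)` with `id`, `type`, and `status` fields):

```python
def summarize_steps(steps):
    """Return one summary line per run step.

    `steps` is any iterable of objects with .id, .type, and .status
    attributes, e.g. the .data items returned by
    client.beta.threads.runs.steps.list(...). A long run of repeated
    tool_calls steps usually points at a function that keeps failing
    and being retried.
    """
    return [f"{s.id}: {s.type} ({s.status})" for s in steps]
```

Printing this summary for the looping run should make it obvious whether the same tool call is being re-issued over and over.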

To fix this, isn’t it possible to use max_tokens and set a limit so the output doesn’t exceed it?

That part is correct: it isn’t possible to use a max_tokens parameter to set a limit, because the Assistants API provides no such limitation mechanism.

The largest concern is input token consumption, because the conversation history and documents added to the input are unchecked and filled to the maximum.
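Since you can't cap the input server-side, you can at least estimate your exposure before starting a run. A rough sketch (using the common ~4-characters-per-token heuristic for English text; for exact counts you would use OpenAI's tiktoken library instead):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # For exact counts, tokenize with OpenAI's tiktoken library.
    return max(1, len(text) // 4)

def estimate_thread_tokens(messages) -> int:
    """Sum a rough token estimate over an iterable of message strings.

    `messages` would be the text content pulled from a thread's
    messages (plus any attached document text) before creating a run.
    """
    return sum(estimate_tokens(m) for m in messages)
```

If the estimate is already large before the run starts, every looped iteration will re-consume that input, which is how a single run balloons to tens of thousands of input tokens.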

+1 on this, seeing it today as well: I get it both via API calls in my app and in the OpenAI Playground.