Why my AssistantAPI make multiple run step and get caught by rate limitation, how to stop

I created assistant API and I have strange behavior.

When I run

agent = OpenAIAssistantRunnable(assistant_id=assistant_api_id, as_agent=True)
output = agent.invoke({"content": some_content})

I get

Rate limit reached for gpt-3.5-turbo-16k in organization org-rpOPUvwCmXg5MqMNr95gwjGx on requests per min (RPM): Limit 3, Used 3, Requested 1. Please try again in 20s. Visit https://platform.openai.com/account/rate-limits to learn more. You can increase your rate limit by adding a payment method to your account at https://platform.openai.com/account/billing.

because Assistant API makes multiple run steps automatically, it seems like that assistant API cannot stop itself even after making correct output.

How to stop it?

I tried all 3.5 model, excluding 4 model

This limitation is because you are in a free trial. The model that powers the assistant has a low rate limit of requests per minute.

Unfortunately, this will prevent you from evaluating an assistant that makes multiple iterative calls. These iterations are often out of your control, made by internal functions such as for file retrieval.

The solution is to add a payment method and purchase a prepay credit for usage.

You can also code agent-like behavior yourself using the chat completions endpoint, and limit the rate at which API calls are made. This also can be more budget-friendly.

Thank you for your answer.

Sometimes assistant API makes infinite run steps in their system

Do you know how to prevent it?
I tried their playground and I confirmed it tries to make infinite run steps and do the same things over and over.
I want to stop this