GPT-3.5 Assistant keeps recalling the same tool

I created an Assistant with a simple function call/tool:
create_and_save: this function tries to save a new record into my database and returns a result such as "Create successfully" or "Record existing" to the Assistant.
I handle the run with a loop that looks like this:

// using the same thread_id
submit message
start new run on thread
while True:
    check run status
    if run.status == 'requires_action':
        response = create_and_save(arguments)
        // response will be "Create successfully" or "Record existing"
        submit tool output to Assistant
    elif run.status == 'completed':
        get thread messages
        return message
    elif run.status == 'failed':
        break
When the Assistant runs, it sometimes keeps recalling create_and_save after receiving the function call response, instead of finishing processing and producing an Assistant reply.
I ran some tests in the playground and never hit this problem there. But when I call from my loop locally, even if I break the process into small steps and run it manually, the problem appears again.
Has anyone faced this problem before, and how should I deal with it?

Mostly this cannot be fixed on your end, because you have no control over the quality of the "thread", the previous conversation.

However, you must at a minimum supply the AI the conversation thread; otherwise it will have no record that it called the function and no place to store its own calling history, and it will keep repeating the same action on the user input. Any prompting like "only call a function once and then answer the user query; iteration is not permitted" will be unsuccessful, because the AI doesn't see its prior function tool calls.

The only thing you can do is have your API return something like "function disabled, answer user with pretrained and existing knowledge only" as the tool output, if you can track which instance is calling.
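One way to implement that "function disabled" return is a small wrapper around the tool handler. This is a sketch under assumptions: `guarded_tool` and `max_calls` are hypothetical names, and you would need one wrapper instance per run for the counter to be meaningful:

```python
def guarded_tool(handler, max_calls=1):
    """Wrap a tool handler so that repeat calls within one run return a
    'stop calling tools' message instead of re-running the tool."""
    state = {"calls": 0}

    def wrapped(**kwargs):
        state["calls"] += 1
        if state["calls"] > max_calls:
            # Returned as the tool output, so the model sees this text.
            return ("function disabled, answer user with pretrained "
                    "and existing knowledge only")
        return handler(**kwargs)

    return wrapped
```

The model still receives a tool output each time (so the run can proceed), but after `max_calls` the output tells it to answer from its own knowledge instead of looping.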

Better: don’t use this API at all if you can code the workflow yourself.

Thank you!
I’m using the latest Assistants API, and as I read in its documentation, it supports run step history.
When I retrieve this list for my thread/run, it is just a list of the same requests and responses from my function, and I have no idea what is happening.
I can save the previous function call and its response locally and check them before executing create_and_save again, but I don’t think that is a good solution.
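For what it's worth, that local check can be written as a small idempotency cache keyed on the call's arguments, so a repeated call replays the stored response instead of hitting the database again. `IdempotentTool` is a hypothetical name, a sketch of the workaround described above rather than an endorsed fix:

```python
import json

class IdempotentTool:
    """Cache tool results by argument set: a repeated call with the same
    arguments returns the stored response instead of re-executing."""

    def __init__(self, handler):
        self.handler = handler
        self.cache = {}

    def __call__(self, **kwargs):
        # JSON with sorted keys gives a stable, hashable cache key.
        key = json.dumps(kwargs, sort_keys=True)
        if key not in self.cache:
            self.cache[key] = self.handler(**kwargs)
        return self.cache[key]
```

The model may still loop on the call, but at least the database write happens only once per distinct argument set.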

I see a similar bug for Assistant runs that don’t use functions.
Messages get re-created up to 20 times on the GPT-3.5 model. GPT-4 works fine but costs 10 times as much.