Assistant API repeats the same message

I am using the Assistant API with the model gpt-4-turbo-preview. After about 20 messages of interaction in the same thread, no matter what I send, it repeats the same response message. Is this happening only to me?

Likely not.

What settings are you using? What do your prompts look like? System, assistant, user…

Could you show your prompt and API parameters (temperature…) ?

Have you experimented with GPT-4 to understand whether this is a prompt or a model-related issue?

I haven’t provided any complex instructions, just the flow of conversation I want to achieve with Assistant. Recently, I tested the GPT-4 model and encountered an issue where arguments were not passed in function calls. Consequently, I tried it with the 3.5-turbo-16k model and didn’t observe the same behavior. However, considering the accuracy of the conversation, I found that the 4-turbo model performed better, so I continued using it.

My implementation is quite simple. I input Instructions to the Assistant and specify the Assistant’s ID in the thread to continue the conversation. The temperature setting and others are left at their defaults.

Do you see the repeat of messages in the playground?

You can access it at:

https://platform.openai.com/playground?assistant=[assistant_id]&mode=assistant&thread=[thread]

20 coincidentally is the default limit of messages returned.

I haven’t tried it in the Playground. Is there any documentation mentioning a 20-message limit?

https://platform.openai.com/docs/api-reference/messages/listMessages

Query parameters:

limit (integer, optional, defaults to 20)
A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.

order (string, optional, defaults to desc)
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.

after (string, optional)
A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.

before (string, optional)
A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.

Returns: A list of message objects.

Not saying it’s the reason. Just figured I’d mention it.
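For anyone unfamiliar with how the limit and after parameters interact, cursor pagination works roughly like this. This is a minimal sketch with a stand-in fetch_page function rather than a real API call; the actual Assistants API uses the after cursor the same way:

```python
# Sketch of cursor-based pagination, mirroring the `limit`/`after` query
# parameters of the list-messages endpoint. `fetch_page` is a stand-in for
# the real API call, operating on an in-memory list instead.

def fetch_page(messages, limit=20, after=None):
    """Return up to `limit` items that come after the object id `after`."""
    start = 0
    if after is not None:
        ids = [m["id"] for m in messages]
        start = ids.index(after) + 1
    return messages[start:start + limit]

def fetch_all(messages, limit=20):
    """Collect every item by following the cursor page by page."""
    collected, after = [], None
    while True:
        page = fetch_page(messages, limit=limit, after=after)
        if not page:
            break
        collected.extend(page)
        after = page[-1]["id"]  # last id on this page becomes the next cursor
    return collected

msgs = [{"id": f"msg_{i}"} for i in range(45)]
assert len(fetch_page(msgs)) == 20  # only the default 20 come back per call
assert len(fetch_all(msgs)) == 45   # following the cursor retrieves them all
```

The point being: a single list call with default parameters only ever shows 20 messages, which can look like the conversation "stopped" even when it didn't.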

I apologize if this sounds stupid…

But instead of getting technical in a prompt, can you just write out in plain English that you want your product descriptions to use different words?

I am also getting the same issue with the Assistant API. When I call the API in a loop, it returns the same message as the prompt on the second iteration of the loop.

Has this been resolved?

To anyone dealing with this issue:

Access the conversation via the playground to first confirm it’s not an issue with the code (not necessarily yours)

https://platform.openai.com/playground?assistant={assistant_id}&mode=assistant&thread={thread_id}

If you see a repeat there, access the run steps

https://platform.openai.com/docs/api-reference/runs/listRunSteps

For help debugging please post your run steps here.


I figured out the root cause. If your token rate limit (per day or per minute) is reached, no response is given, even though the request itself was accepted when it was made. After I put the thread to sleep for a minute between iterations, the loop works for me.
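Guarding a loop against rate limits can be done with a backoff between retries. A minimal sketch, where call_api is a stand-in for the actual request and RuntimeError is a stand-in for whatever rate-limit error your SDK version raises:

```python
import time

def call_with_backoff(call_api, max_retries=5, base_delay=60.0):
    """Retry `call_api` when it signals a rate limit, sleeping between tries.

    `call_api` is any zero-argument callable. Here a raised RuntimeError is
    treated as a stand-in for a rate-limit error; substitute the exception
    class your SDK actually raises.
    """
    for attempt in range(max_retries):
        try:
            return call_api()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: 60s, 120s, 240s, ...
            time.sleep(base_delay * (2 ** attempt))
```

A fixed one-minute sleep (as above) also works; exponential backoff just recovers faster when the limit clears early and waits longer when it doesn't.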

I have another issue with tokens; I will open a new thread about it.

I poll the thread for new messages, fetching the messages after the last one received. However, sometimes I got a valid text message that was empty, and the real message then never arrived.

So I debugged this, and it turns out that I first receive an empty message, and then a little later the same message id is used for the actual message.

I.e., the same id has two different messages, as if the message is first created (and I detect it) and then a bit later it is modified to contain the content.

Can anybody verify this? Is there a way to generically detect these half-finished messages?

Can you log the raw data that gets received? I.e., capture it with Wireshark or something. That way you can be 100% sure the data was duplicated, once with an empty string and once populated. If you find that to be the case, you should create an API bug entry, as that should not be happening, at least not to the best of my knowledge.

This is the log from a background process polling the message thread. It lists messages from the id of the last received message (initially null or not set).

{"assistant_id":null, "thread_id":"thread_E16Yj95Onb0ZNaUVZcnqaC9Y", "run_id":null, "role":"user", "content":[{"text":{"value":"list the content of the bnd files in the current projects", "annotations":[]}, "type":"text"}], "file_ids":[], "id":"msg_CFEgb2GpBfmHk3FIy5RmxpGX", "metadata":{}, "created_at":1707837629, "object":"thread.message"}
id msg_CFEgb2GpBfmHk3FIy5RmxpGX
last id = msg_CFEgb2GpBfmHk3FIy5RmxpGX

Then later I get a message from the assistant but it is empty

{"assistant_id":"asst_wffL2sCOXGIsFnEqQlRtZWV7", "thread_id":"thread_E16Yj95Onb0ZNaUVZcnqaC9Y", "run_id":"run_9zR447d886mANtOQ1aQJDn2m", "role":"assistant", "content":[{"text":{"value":"", "annotations":[]}, "type":"text"}], "file_ids":[], "id":"msg_y1ThJlxAYaC6QPxjtu3WLy0W", "metadata":{}, "created_at":1707837635, "object":"thread.message"}
received empty with id msg_y1ThJlxAYaC6QPxjtu3WLy0W (removes message)
id msg_y1ThJlxAYaC6QPxjtu3WLy0W

In this test, I remove the empty message from the list of received messages so it will not be used as lastId; that honor falls to the first message instead.

last id = msg_CFEgb2GpBfmHk3FIy5RmxpGX

On the next poll it gets the actual message, with the same id as before!

{"assistant_id":"asst_wffL2sCOXGIsFnEqQlRtZWV7", "thread_id":"thread_E16Yj95Onb0ZNaUVZcnqaC9Y", "run_id":"run_9zR447d886mANtOQ1aQJDn2m", "role":"assistant", "content":[{"text":{"value":"The content of the bnd files in the current pro...the bnd configuration file for the `test1` project", "annotations":[]}, "type":"text"}], "file_ids":[], "id":"msg_y1ThJlxAYaC6QPxjtu3WLy0W", "metadata":{}, "created_at":1707837635, "object":"thread.message"}
id msg_y1ThJlxAYaC6QPxjtu3WLy0W
last id = msg_y1ThJlxAYaC6QPxjtu3WLy0W

The problem is that a lot of my code runs without knowing the message type, so it is hard to check for ‘empty text’. :frowning:
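One workaround is to treat a message as complete only when every text part is non-empty, and to hold back the lastId cursor otherwise. A sketch over the message shape shown in the JSON logs above (the field names follow those payloads):

```python
def is_complete(message):
    """Return True when every text part of an Assistants message has content.

    `message` is a dict shaped like the JSON payloads above: `content` is a
    list of parts, each with a `type` and, for text parts, a `text.value`
    string. A message with no parts, or any blank text part, is treated as
    not-yet-complete so the poller will pick it up again on the next pass.
    """
    parts = message.get("content", [])
    if not parts:
        return False
    for part in parts:
        if part.get("type") == "text" and not part["text"]["value"].strip():
            return False
    return True

empty = {"id": "msg_y1", "content": [{"type": "text", "text": {"value": "", "annotations": []}}]}
full = {"id": "msg_y1", "content": [{"type": "text", "text": {"value": "The content of the bnd files...", "annotations": []}}]}
assert not is_complete(empty)
assert is_complete(full)
```

This only covers text parts; if your threads also carry image or tool-output parts, the check would need to be extended per part type.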

I did spend 5 minutes googling but could not find where OpenAI accepts bug reports. Do you have a link?

When the model’s token limit is exceeded in the response, unless it is a TPM or RPM limit (as the API documentation details), a number of undesirable outcomes can happen: sometimes the message will be left incomplete, and sometimes it gets really messy and you won’t know, because to this day OpenAI struggles to return an Exception we can catch in such scenarios. If you exceed the token limit in the request, the RPM, or the TPM, then it will terminate your run and return an error.

So, my tip is to NOT even get close to the limit. Use tiktoken to process your data in chunks so that you can process your whole dataset safely, without random errors or ‘nulls’.
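Chunking by token count can be sketched like this. Here a rough whitespace-based estimate stands in for tiktoken’s encoder, so the counts are approximate; swap in the real encoder for accurate limits:

```python
def estimate_tokens(text):
    """Very rough token estimate (word count); replace with tiktoken's
    encoding.encode(...) length for real counts."""
    return len(text.split())

def chunk_by_tokens(items, max_tokens):
    """Group `items` (strings) into chunks whose estimated token total stays
    under `max_tokens`, so each API call stays well below the limit."""
    chunks, current, current_tokens = [], [], 0
    for item in items:
        n = estimate_tokens(item)
        if current and current_tokens + n > max_tokens:
            chunks.append(current)        # close the current chunk
            current, current_tokens = [], 0
        current.append(item)
        current_tokens += n
    if current:
        chunks.append(current)
    return chunks

comments = ["great product", "too expensive for what it does", "love it"] * 10
batches = chunk_by_tokens(comments, max_tokens=20)
# every batch stays under the budget, and no comment is dropped
assert all(sum(estimate_tokens(c) for c in b) <= 20 for b in batches)
assert sum(len(b) for b in batches) == len(comments)
```

Leaving headroom below the hard limit (e.g. budgeting for 50–70% of it) also leaves room for the response tokens, which count against the same limits.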

Token limit is not the issue here.

These messages are not very long and in the same run, a bit later, the message appears.

Hi,

My answer was directed to @siva2.

Regarding your problem, I faced something (probably) similar in the past. I have a series of social media comments that I process in batches. Sometimes the API changed the order of those comments, and that caused my program to fail, because my response handling depended on that order being preserved.

It is a bug; if this is your case, you will have to implement a series of strategies to overcome it, both in your prompt and in your response handling.

I’ve moved this thread to the API bugs section :+1: