My GPT-4 app seems to be caching responses and returns the same output every time. I don't want this, because the API can keep a wrong answer and then repeat it every time. How can I prevent that?
What do you mean by that?
My GPT-4 API call returns the same output every time.
Can you explain what you are trying to achieve?
I mean, you could change the temperature parameter. But be careful and check the results. GPT models are not made to find truth.
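To illustrate: a minimal sketch of passing a higher temperature to get varied outputs, assuming the OpenAI Python SDK (v1.x) and Chat Completions. The model name and prompt here are placeholders; the request is only assembled, not sent.

```python
def build_request(prompt: str, temperature: float = 1.0) -> dict:
    """Assemble Chat Completions parameters. temperature > 0 samples more
    randomly on each call; temperature = 0 is near-deterministic."""
    return {
        "model": "gpt-4",  # placeholder model name, substitute your own
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

params = build_request("Explain caching in one sentence.", temperature=1.2)

# The real call would then be:
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(**params)
print(params["temperature"])  # 1.2
```

Even with a higher temperature, verify the answers: more randomness means more variety, not more correctness.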
Are you maybe using the Assistants API?
There is a default limit of 20 messages, after which the last message in the conversation gets repeated. But this can be adjusted.
That’s not how it works, unless you are writing bad code that expects every request to return the last item of an ascending-order list of all thread messages, with no size limit. You can send a different API parameter, but that gets you at most 100. The proper way is to leave the messages in the default descending sort order and just pick a few of the most recent off the top; fetch the ones beyond the latest only if you want to validate the expected turns. Or stream events.
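The “descending order, take from the top” approach can be sketched in plain Python. The dict shape loosely mirrors what a newest-first message listing returns, but the data here is made up:

```python
def latest_assistant_message(messages_desc):
    """messages_desc: newest-first list of {'role', 'content'} dicts,
    as returned by a descending-order message listing.
    Returns the content of the newest assistant message, or None."""
    for msg in messages_desc:
        if msg["role"] == "assistant":
            return msg["content"]
    return None

# Made-up thread contents, newest first:
thread = [
    {"role": "assistant", "content": "Here is the fix."},
    {"role": "user", "content": "Please fix the bug."},
    {"role": "assistant", "content": "Hello!"},
]
print(latest_assistant_message(thread))  # -> Here is the fix.
```

With descending order, the latest reply is always near index 0, so the pagination limit never matters for the common case.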
However, normal programmatic use of the same thread ID over and over will indeed give an AI that learns your preference for speaking Spanish. A thread is a conversation session that provides chat history memory.
Perhaps there is just a misunderstanding that:
- reusing the same thread ID continues a chat session;
- creating a new thread starts a blank chat session ready for the first user message.
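The two bullets can be made concrete with a toy in-memory model of thread semantics. This is not the real API, just the concept: reusing an ID keeps history, a new thread starts blank.

```python
import uuid

class ToyThreadStore:
    """Toy model of thread semantics, for illustration only:
    reusing a thread ID continues a session; a new thread is empty."""
    def __init__(self):
        self._threads = {}

    def create_thread(self) -> str:
        tid = str(uuid.uuid4())
        self._threads[tid] = []          # blank chat session
        return tid

    def add_message(self, tid: str, role: str, content: str):
        self._threads[tid].append({"role": role, "content": content})

    def history(self, tid: str):
        return list(self._threads[tid])

store = ToyThreadStore()
t1 = store.create_thread()
store.add_message(t1, "user", "Responde en español, por favor.")
store.add_message(t1, "assistant", "¡Claro!")
# Reusing t1 continues the session: the model would see both turns above.
print(len(store.history(t1)))   # 2

t2 = store.create_thread()      # fresh thread: no history at all
print(len(store.history(t2)))   # 0
```

If code accidentally reuses `t1` for what the user thinks is a new conversation, the old turns (and old answers) keep influencing every reply.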
But it’s a mistake that could be made, and then this could happen.
?
Memory of a chat should not be a black box…
It’s not. Merely grey. What @vb describes is “list messages” of a thread, where you can see every message that is in a thread, assistant or user. You need to paginate and make more successive calls if the message list return is larger than one API call. (Using that method improperly to retrieve messages could result in always getting the same 20th message as “the latest” – not this symptom.)
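The pagination part can be sketched as a cursor loop. The `fetch_page` function here is a stand-in over fake data, not the real endpoint; the real listing call takes analogous `limit` and `after` parameters, and misusing them is exactly how you end up re-reading the same page of 20 forever:

```python
def fetch_page(after=None, limit=20):
    """Stand-in for a paged 'list messages' endpoint over fake data.
    Returns (items, last_id, has_more)."""
    data = [{"id": f"msg_{i}", "content": f"message {i}"} for i in range(45)]
    start = 0 if after is None else next(
        i + 1 for i, m in enumerate(data) if m["id"] == after)
    page = data[start:start + limit]
    has_more = start + limit < len(data)
    last_id = page[-1]["id"] if page else None
    return page, last_id, has_more

def list_all_messages():
    """Cursor pagination: keep passing the last seen id as 'after'
    until the source reports no more pages."""
    out, cursor, more = [], None, True
    while more:
        page, cursor, more = fetch_page(after=cursor)
        out.extend(page)
    return out

print(len(list_all_messages()))  # 45, gathered across pages of 20/20/5
```

Forgetting to advance the cursor (always calling with `after=None`) would return the same first page on every call, which looks a lot like "caching".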
There are also “run steps”, so you can get an idea of the internal tools used (this is the part of a thread that is not disclosed to the developer as messages).
The method actually used, the code employed, still has not been offered, leaving us to do no more than guess.
Yeah you have an ocean of messages and you buy them frozen and packed up from a supermarket… Nah… let me fish them by myself. Give me access to the fishing boat and let me check the size of the nets…
We will have to wait for OP to clarify.
But if the reply from the model is seemingly always the same, it could have several causes.
Still unclear how you can rule out this particular culprit from the original problem description.
Sorry - there are two similar threads without an OP response - the other is also a “cache” complaint, but about the AI somehow remembering a preference for the Spanish language.
I believe the symptom described here is the AI model in a chat getting “hung up” on a previous answer, unable to break away from a bad pattern of responses after a growing chat.
This is unfortunately common in the recent models - get 10 messages in, and you get old turns of code output by the AI passed off as the “fixed” version again, the AI simply not following new instructions, unable to iterate improvements. You can say “try again” and the AI will alternate between two bad answers, and the chat must be abandoned.
Retrieving messages is the very foundation of how Assistants works: you place a message, you poll, and then you retrieve the new message the AI placed into the thread once it is ready. There are much better reasons not to use Assistants than this.
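That place/poll/retrieve loop can be sketched as follows. The status sequence here is faked; in real code, `get_status` would wrap a run-retrieval call, and the terminal states would include things like "failed" as well:

```python
import time

def poll_until_done(get_status, interval=0.0, max_polls=50):
    """Poll a run-status callable until it leaves the in-progress states.
    get_status: () -> str, e.g. wrapping a run-retrieval API call."""
    for _ in range(max_polls):
        status = get_status()
        if status not in ("queued", "in_progress"):
            return status
        time.sleep(interval)
    raise TimeoutError("run did not finish")

# Fake status sequence standing in for the real API:
statuses = iter(["queued", "in_progress", "in_progress", "completed"])
final = poll_until_done(lambda: next(statuses))
print(final)  # completed

# Once the run is completed, list the thread's messages (newest first)
# and read the assistant's reply off the top.
```

In production you would use a non-zero `interval` (or a streaming/event-based approach) rather than hammering the endpoint.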
That was more like an “at least” addition to my “I want to fish by myself”…
I am totally aware that I can do that…