Had this problem today. I’m using gpt-3.5-turbo-0613 to support a chat conversation for a QA system. It is supposed to call a function whenever the user asks a question. The problem is that after a few iterations (2-3) I start to see some misalignment: the model starts to generate text instead of calling the function it is supposed to call. I think it’s because the previous iterations start to act as in-context learning and bias the model toward generating text. Any ideas for working around this issue?
It’s a tough one. I suspect you’re correct in the assumption that the more data there is in the conversational buffer, the less priority is given to the function-calling aspect. If you can get some hard data on this, it would be SUPER useful if you could post it here, so everyone can get an idea of what the limits are, and of possible solutions and workarounds.
As a starting point, perhaps look at the API library source to see what text gets sent to the model for function calls, and repeat that text every 3 messages? Something along those lines should help. My guess is that it’s a keyword like “[Function]” followed by a JSON string like the one you use to specify the function definition originally. (I’ve not looked, so this is a guess.)
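Something along these lines might work as a periodic reminder (a minimal sketch using the pre-1.0 openai Python library; the reminder wording, FUNCTION_REMINDER, and chat_with_reminder are my own hypothetical names and guesses, not anything the library actually sends):
import openai

# Hypothetical reminder text, a guess rather than the library's internal format.
FUNCTION_REMINDER = {
    "role": "system",
    "content": "Reminder: if the user's message is a question, call the "
               "provided function instead of answering in plain text.",
}

def chat_with_reminder(messages, functions, every_n=3):
    # Re-inject the reminder every `every_n` messages so the instruction
    # stays near the end of the context window.
    padded = list(messages)
    if len(padded) % every_n == 0:
        padded.append(FUNCTION_REMINDER)
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=padded,
        functions=functions,  # the definitions are re-sent on every call anyway
        function_call="auto",
    )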
Ok, thanks, I see. Yes, perhaps some tags or keywords signalling that a function was called in previous iterations might work. However, I’m not sure whether the model would then start to think it needs to generate those keywords and JSON strings itself…
I’ll try (tomorrow) to replicate an example and share it here.
I did a few more tests on this issue today and I think I found a workaround that gives better results. I changed the main flow of the conversation a bit: instead of doing a traditional back-and-forth conversation, I now work in a single iteration, where I show the full conversation in the first turn as if it were a conversation between two users. This way, the model always acts as if it were on its first iteration.
Yes, that’s a good idea! I have started to use “reminders” to consider the whole conversation when I expect that the context will get lost, which in your case would translate to a reminder to use the functions when necessary. The prompts get a bit more complex, but it’s working better than relying on the model to consider the whole context window by itself every time.
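For example (just a sketch; the reminder text and the with_reminder helper are illustrations of the idea, not a fixed recipe):
REMINDER = ("Remember the instructions in the system prompt and the whole "
            "conversation so far; use the available functions when needed.")

def with_reminder(messages):
    # Append the reminder as a final system message just before generation,
    # so it is the most recent instruction the model sees.
    return messages + [{"role": "system", "content": REMINDER}]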
Hi, could you explain how to show the full conversation in one turn? Do you mean putting all of the previous back-and-forth conversation into a single message? Thanks!
Sure. Currently I’m simply doing something like the following:
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
conversation = []
for user_message, bot_message in enumerate(history):
if user_message:
conversation.append(">User: "+user_message)
if bot_message:
conversation.append(">HelpAI: "+bot_message)
conversation = '\n'.join(conversation)
messages.append({"role": "user", "content": first_prompt.format(chat=conversation)})
Where I prompt the model to predict what the next message from the user HelpAI will be.
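For reference, the full call could look something like this (a sketch; the exact first_prompt wording and the FUNCTIONS list are assumptions on my part):
import openai

# Hypothetical wording for first_prompt; the exact text is an assumption.
first_prompt = (
    "Below is a conversation between two users, User and HelpAI.\n\n"
    "{chat}\n\n"
    "Predict the next message from HelpAI. If the last User message is a "
    "question, call the appropriate function instead of writing text."
)

# FUNCTIONS stands for the same function definitions sent on every call.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=messages,  # built as shown above
    functions=FUNCTIONS,
    function_call="auto",
)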
Thank you for sharing!! I am going to give it a try. I also had problems keeping the chatbot in character after a few conversation iterations. For example, if I ask it to tell me a joke, the first time it will reply that it can’t answer… blah blah. Then I ask it again, and it will tell a joke. There are some other problems like this. It seems to be hard for it to follow the instructions after a few conversation iterations.
I noticed the exact same thing, and in my case the solution has been to intelligently trim the history if it seems like the user is moving on to a new unit of work.
My app allows calling API functions like machine translation between languages, and it’s meant to receive repeated input sentences in foreign languages to process.
Between each sentence, I ask ChatGPT whether the input is a new sentence (if so, trim the history), or whether it’s a question regarding the previous sentence (if so, keep the history).
This seems to work quite well, and trimming the history is also preferable from a token-count standpoint.
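In code, the classification step could look something like this (a sketch assuming the pre-1.0 openai Python library; the prompt wording and the maybe_trim helper are hypothetical):
import openai

CLASSIFY_PROMPT = (
    "The previous input sentence was:\n{previous}\n\n"
    "The new user input is:\n{current}\n\n"
    "Answer with exactly one word: NEW if this is a new sentence to process, "
    "or FOLLOWUP if it is a question about the previous sentence."
)

def maybe_trim(history, previous_sentence, new_input):
    # Ask the model to classify the new input, then drop the history if the
    # user has moved on to a new unit of work.
    result = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[{"role": "user",
                   "content": CLASSIFY_PROMPT.format(previous=previous_sentence,
                                                     current=new_input)}],
        temperature=0,
    )
    answer = result["choices"][0]["message"]["content"].strip().upper()
    return [] if answer.startswith("NEW") else history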