Why is ChatGPT so moody, refusing to answer the exact same question it answered previously?

I’m using ChatGPT 3.5 (on the waiting list for 4) to get responses for travel schedules in JSON format. It usually works, but sometimes, and quite often today, the response is “Sorry, as an AI language model I cannot…”.

You might be able to improve the prompt. For generating code, I’ve found that “make an app that…” fails in a similar way to what you describe, but “output the code for an app that…” succeeds.

More generally, perhaps you can try giving instructions the model is clearly allowed to follow (“output the JSON for X”) rather than the more informal “do X”.

Thanks for your help. I have done some of that, but I will try some more variations on the prompt. The problem is that once I find one that works, it works for quite a while until it doesn’t, so it’s hard to test a solution when the response varies. It also complains in different ways. Sometimes it says that it can’t answer because, as an AI language model, it does not have any experience or opinions, and sometimes it says that as an AI language model it can reply in text, and then it does. Currently it claims not to be able to answer in JSON even though it can and it has. In the first case (when it says that it doesn’t have any experience) it does provide a JSON response, so it’s just a matter of removing the extra text in front of the JSON string.
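
In case it helps, the clean-up for that first case looks roughly like this (illustrative Python; my program isn’t actually written in Python, and extract_json is just a name I made up). It finds the first “[” or “{” in the reply and parses the JSON from there, ignoring any apology text in front of it:

```python
import json

def extract_json(reply: str):
    """Parse the first JSON value in the reply, skipping any leading text."""
    # Find where the JSON appears to start ("[" for the schedule array, "{" just in case).
    starts = [i for i in (reply.find("["), reply.find("{")) if i != -1]
    if not starts:
        return None  # no JSON at all, e.g. a plain-text refusal
    try:
        # raw_decode parses the first JSON value and ignores anything after it.
        value, _ = json.JSONDecoder().raw_decode(reply[min(starts):])
        return value
    except json.JSONDecodeError:
        return None
```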

Responses also depend on the ‘temperature’ setting.

cheers!

I’m using these parameters:

```
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": $q}],
"temperature": 0,
"max_tokens": 3048,
"top_p": 1,
"frequency_penalty": 0,
"presence_penalty": 0
```
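
The actual call isn’t written in Python, but in Python terms it amounts to roughly this (get_schedule is just an illustrative name; the request body matches the parameters above):

```python
import os
import requests

def get_schedule(question: str) -> str:
    """Send one prompt to the Chat Completions endpoint and return the reply text."""
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-3.5-turbo",
            "messages": [{"role": "user", "content": question}],
            "temperature": 0,
            "max_tokens": 3048,
            "top_p": 1,
            "frequency_penalty": 0,
            "presence_penalty": 0,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```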

Should I change, or remove any of them or add others?

Thanks!

Maybe work with it a little in the playground first, or maybe you’re already doing that? As you already know, it seems to be very dependent on the prompt. I’d have to confirm this in the tech docs, but I’d guess that when a prompt has a number of plausible responses, even temp=0 will show some random scatter. LOL, normal physics doesn’t apply!

Thanks for helping. I have tried it in the playground as well as many times in the app. The problem is that it can work correctly for quite a while, even several days in a row, until suddenly… something is different. I characterize it as ChatGPT having “multiple personalities”, and it may be a byproduct of the way that neural networks work, or maybe the load on the system, or some other factor. In my app I allow users to personalize their vacation by changing the prompt, e.g. adding “Dog friendly” as a modifier, and I notice the problem more in those cases. Understandably the input is then different, even if only slightly, but not so different that the model should refuse to provide the response in JSON format.

It’s all about phrasing your words.

It’s pretty much exactly telling you what you’re doing wrong.
Re-phrase your question to not be opinionated or reflective.

Q. playa del carmen or cancun beach, which would you prefer?
A. As an AI language model, I do not have personal preferences or experiences, but I can provide you with some information about both

Q. What are the differences between playa del carmen, and cancun beaches? What is more popular among young tourists with dogs?
A. Playa del Carmen and Cancun are both popular beach destinations in Mexico, but they have some differences in terms of atmosphere[…]

The prompt is sent via API from inside a program, so I can’t change the prompt based on the response, at least not easily. A typical sample question is:
Output a detailed 1-day vacation schedule for Florence Italy starting on Mar 27. Provide the response in RFC8259 compliant JSON, following this format.
[{
'Date': 'the date to visit the attraction, in Month Day format',
'Time': 'the time to visit the attraction, in 24 hour format',
'Duration': 'the amount of time to visit the attraction, in minutes',
'Attraction': 'the name of the attraction',
'City': 'the city in which the attraction is located',
'Category': 'the category of the attraction',
'Description': 'a description of the attraction'
}]

This works in the sandbox and in the program “almost” all of the time.
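
When it does answer in JSON, a rough sanity check along these lines (illustrative Python; looks_like_schedule is a made-up name) would at least catch replies that come back with missing fields or in a different shape than requested:

```python
REQUIRED_KEYS = {"Date", "Time", "Duration", "Attraction", "City", "Category", "Description"}

def looks_like_schedule(data) -> bool:
    """Check that the parsed reply is a non-empty list of objects with the fields the prompt asks for."""
    return (
        isinstance(data, list)
        and len(data) > 0
        and all(isinstance(item, dict) and REQUIRED_KEYS <= item.keys() for item in data)
    )
```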

Interesting.

Can you add onto this program? Perhaps a second prompt to adjust the question to not be opinionated? There’s no way that the following prompt that I made is efficient, but it may be something to build on?

If you are going to answer something that begins with “As an AI…” to the following question, instead please re-format the question to be less opinionated so that it can be answered properly.
<|SEP|>
“playa del carmen or cancun beach, which would you prefer?”

Response: Which destination, Playa del Carmen or Cancun, has more to offer in terms of beach activities and amenities?

If you are going to answer something that begins with “As an AI…” to the following question <— This part is probably completely useless and would probably be better as “if the question is opinionated, blah blah blah”. Worth testing.
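
As a rough Python sketch of that two-pass idea (chat() stands in for whatever function sends a single prompt and returns the reply text; the instruction wording is only a starting point):

```python
from typing import Callable

REPHRASE_INSTRUCTION = (
    "If the following question is opinionated or would normally be answered with "
    '"As an AI...", re-format it so that it can be answered properly, and return '
    "only the re-formatted question. Otherwise return the question unchanged."
)

def ask_neutralized(question: str, chat: Callable[[str], str]) -> str:
    """First pass neutralizes the question, second pass sends the neutralized version as the real prompt."""
    neutral = chat(f'{REPHRASE_INSTRUCTION}\n<|SEP|>\n"{question}"')
    return chat(neutral)
```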

There are 2 problem categories. When it responds with “Sorry, as an AI language model I don’t have experience…” it turns out (at least from the samples I’ve seen) that after saying that, it still does respond correctly in JSON, so the solution is to simply remove the text before the JSON response. However, when it responds with “Sorry, as an AI language model I can’t…” it also continues to respond but in text format, and not in JSON format. This one is more complicated to fix and opens up other possible problems. I could try to insist on it responding in JSON, but I would like to find a way to prevent the problem from occurring. Like they say, an ounce of prevention is worth a pound of cure.

Seems like you’re stuck between a rock and a hard place.

The immediate solution to me would be to “catch” or filter these types of responses, and then re-feed them modified. But you’re saying that you cannot (easily) do that.

I’m not too sure what your current setup is that you cannot modify a prompt after it’s sent out. To me this already seems like a very concerning situation. What about malicious prompts? What about spam? What if the response isn’t the expected JSON object? Is this all managed by whatever you are using to process the prompt?

Realistically, if you are requiring a very delicate format such as a compliant JSON object, you should be pre-processing and post-processing your prompt and results.

I wouldn’t even care if OpenAI said: “Hey, GPT-5 now is 100% perfect at responding with JSON!”. I would still be confirming that before accepting it.

This could be as simple as restructuring the prompt to avoid the “As an AI…” response, and then confirming that the reply is just a compliant JSON object. Otherwise, if it sends text along with the JSON object, your pipeline will be ruined and your JSON file or database could become corrupted.
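
As a sketch of what I mean (Python for illustration; chat is a stand-in for whatever function sends one prompt and returns the reply text, and ask_for_json is a made-up name):

```python
import json
from typing import Callable

def ask_for_json(prompt: str, chat: Callable[[str], str], max_attempts: int = 2):
    """Accept only replies containing parseable JSON; otherwise re-feed once with a firmer reminder."""
    decoder = json.JSONDecoder()
    for _ in range(max_attempts):
        reply = chat(prompt)
        # Skip any leading "As an AI..." text and try to parse the first JSON value.
        starts = [i for i in (reply.find("["), reply.find("{")) if i != -1]
        if starts:
            try:
                value, _ = decoder.raw_decode(reply[min(starts):])
                return value
            except json.JSONDecodeError:
                pass
        prompt += "\nRespond with RFC8259 compliant JSON only, with no text before or after it."
    return None  # caller decides whether to show an error or try a re-phrased prompt
```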

Can you change the prompt before the response? I imagine you must have some sort of front-end service like a website or chatbot that takes in this information first. Surely you can work on that and clean it?

I do have a way to catch these types of responses and I could re-feed them, but what would I change the prompt to when it fails to give me JSON? Each response takes 20-30 seconds, so the user will get tired of waiting if I try too many times.

I just got the email that says that I can use GPT-4. I will try that to see if it resolves or minimizes this issue.

Good question.

Not really sure without knowing more. Lots of testing and analytics most likely.