Why did hallucination increase after the fixes made on 16th Oct?

For one of my use cases, I had formulated a prompt that also used function calling. This prompt was working perfectly fine until the night of 16th Oct. After the fix OpenAI made for the elevated error rate, I have noticed that the exact same prompt has stopped working: hallucinations have increased drastically.

Is anyone aware of the reason for this? And is there any other possible solution for this?

  1. what “fix”?
  2. what model?
  3. what prompt?

The answer is basically “there is only one function-calling model, and we will hit it with whatever quality degradation we feel like, without comment. Good luck with that.”

I am using the gpt-3.5-turbo-16k-0613 model.
For reference, I am attaching the prompt:

Prompt:

The customer is interested in buying the iPhone 14 phone, now you need to perform the below steps one at a time:
1. You need to state the benefits of the Apple Care Plus product which are mentioned in triple backticks.
2. Ask the customer if they are interested in buying the Apple Care Plus product along with iPhone 14.

Apple Care Plus Benefits:
```
    Extended warranty
    Door-to-door service
    Priority Service
```

Always ensure that if the customer is interested in buying Apple Care Plus then you must call the interested_in_applecare_decision function.
Always ensure that if the customer is not interested in buying Apple Care Plus then you must call the not_interested_in_applecare_decision function.
Always ensure that if the customer is not interested in buying the iPhone 14 phone then you must call the not_interested_buying_iphone_decision function.

This was working fine, but after the fix it is not. Why is this happening, and how can it be fixed?

First, there is no reason to pay double if you don’t need the context length of the -16k model. If it is really possible for your application to exceed 4k, you can build a more intelligent model-select mechanism into your software.
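For illustration, here is a minimal sketch of such a model-select mechanism; the tiktoken-based count, the 0613 model names, and the reply-token headroom are assumptions for the example, not a drop-in solution:

```python
import tiktoken

# Rough token counter for chat messages (ignores the small per-message
# overhead the chat format adds).
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def pick_model(messages, reply_budget=500):
    """Return the cheap 4k model unless the prompt needs 16k of context."""
    used = sum(len(enc.encode(m["content"])) for m in messages)
    if used + reply_budget <= 4096:
        return "gpt-3.5-turbo-0613"
    return "gpt-3.5-turbo-16k-0613"
```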

You should find a way to minimize the system prompt. The functions should have clear names and descriptions so they can stand alone on their own merit.
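As an example of what “standing alone” might look like, one of your functions could carry its routing condition in its description instead of in the system prompt (the schema below is illustrative, not your actual definition):

```python
# 2023-era "functions" format for the Chat Completions API.
functions = [
    {
        "name": "interested_in_applecare_decision",
        "description": (
            "Call this when the customer confirms they want to buy "
            "Apple Care Plus along with the iPhone 14."
        ),
        "parameters": {"type": "object", "properties": {}},
    },
    # not_interested_in_applecare_decision and
    # not_interested_buying_iphone_decision follow the same pattern.
]
```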

gpt-3.5 models have been hit with quality degradation in following system instructions going back a month. You cannot fix this. (They broke GPT-4, and thus had to go after the gpt-3.5 that still worked and embarrassed the 30x more expensive model?)

Thanks, yes in our use case it is possible to exceed 4k.

Okay, I will minimise the prompt, but why did this suddenly stop working?

It stops working because OpenAI continues to integrate new fine-tuning into models and continues to “optimize” them to give them the minimum appearance of operating properly with minimum computation.

Okay, so if a prompt is working correctly now and new fixes are made to the model, there is a high chance that the prompt that was working before will stop working.

Yes. You need to make your system message, functions, and user message encapsulation as clear, simple, and distinct as possible, so that it is robust against any further model tuning and is not already running at the edge of the AI’s ability to understand what is going on.
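A sketch of that separation (the wording here is illustrative; only the distinct role structure matters):

```python
messages = [
    # System: only the policy and the product facts.
    {"role": "system", "content": (
        "You are a sales assistant selling the iPhone 14 and Apple Care Plus. "
        "State the plan benefits, then ask if the customer wants it."
    )},
    # User: only the customer's own words, nothing else.
    {"role": "user", "content": "I'd like to buy the iPhone 14."},
]
```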

You can also get more reliable operation by including an API parameter such as top_p = 0.3 (or lower), so that you only generate the most likely path of language completion, avoiding unexpected tokens even when model perplexity increases.
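For reference, a minimal sketch of passing top_p through the ChatCompletion call; this uses the pre-1.0 openai Python library that was current when this thread was written, and assumes `messages` and `functions` are the ones built above:

```python
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k-0613",
    messages=messages,
    functions=functions,
    top_p=0.3,  # sample only from the top 30% of probability mass
)
print(response.choices[0].message)
```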

Thanks for the help, I will try with top_p = 0.3.