GPT-4o hallucinates a lot

One of the reasons may be that the information to be referenced is contained in both the system message and the user message.

So, in the system message I specify which sections the model should reference: the ‘CONTEXT INFORMATION’, the ‘Expected Response’ from the user message, and the ‘Important Section’.
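For anyone wanting to try this through the API, here is a minimal sketch of what such a system message could look like, assuming the standard Python openai client; the section names follow the post above, but the surrounding wording is only an illustrative guess, not the actual preset.

```python
from openai import OpenAI

client = OpenAI()

# Illustrative system message: explicitly name the sections the model may rely on.
# The section names come from the post above; the rest of the wording is an assumption.
system_message = (
    "Answer using only the sections listed below:\n"
    "- CONTEXT INFORMATION (in this system message)\n"
    "- Expected Response (in the user message)\n"
    "- Important Section\n"
    "If the answer is not covered by these sections, say you don't know.\n\n"
    "CONTEXT INFORMATION:\n"
    "<reference material goes here>"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": "Expected Response: a short factual answer.\n\nQuestion: ..."},
    ],
)
print(response.choices[0].message.content)
```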

With this change, I’m no longer seeing hallucinations in the example shared in the link.

I’m not sure how this will work with other CONTEXT INFORMATION, and the problem may not be completely solved until the model itself is improved.

But this change in the system message seems to have made a good difference, so I’ll share the link to the results.

https://platform.openai.com/playground/p/CJ2K3xvvVKWGlWrgzg63tJrK

I hope this can be of some help🙂


Hi, I checked your preset. It modifies the messages too much. Can you fix the assistant’s last message by modifying only the system prompt, without changing the rest of the user messages and the assistant message? I added what you proposed to the system prompt and I still have the same hallucination issue.

https://platform.openai.com/playground/p/yEwza9Ibnw3RnQxGNnTwzOqd

Changing anything other than the system prompt feels like a cheat fix.

This is a production situation where my clients have this issue, so we can only fix the system prompt, not the user messages in the middle of the conversation.

Sorry, since the model was trained on a completely different dataset, there may be limits to what ad-hoc prompting can do😥

I just thought that the hallucinations already present in the responses returned by GPT-4o might be the cause.

While it may be wise to carefully consider whether to use GPT-4o in production at all, the best workaround I found was to delete the assistant message starting with “Ja, vi har to butikker i København:” (“Yes, we have two stores in Copenhagen:”), which contains the hallucination, and then re-ask the question.
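In code, that workaround amounts to dropping the offending assistant turn from the conversation history before re-sending the request. A minimal sketch, assuming the history is kept as a plain list of chat messages (the user turns here are placeholders, not the actual preset):

```python
from openai import OpenAI

client = OpenAI()

# Placeholder conversation history in the shape the chat API expects.
history = [
    {"role": "system", "content": "<system prompt>"},
    {"role": "user", "content": "<original question>"},
    {"role": "assistant", "content": "Ja, vi har to butikker i København: ..."},  # hallucinated turn
    {"role": "user", "content": "<follow-up question>"},
]

# Remove the assistant message that starts with the hallucinated claim,
# then re-ask with the cleaned history.
cleaned = [
    m for m in history
    if not (m["role"] == "assistant"
            and m["content"].startswith("Ja, vi har to butikker i København:"))
]

response = client.chat.completions.create(model="gpt-4o", messages=cleaned)
print(response.choices[0].message.content)
```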

Sometimes I suspect it’s really just GPT-3.8, the hallucination is so strong!
But for its speed, there’s no better choice.
You can try putting your emphasized words in different positions, such as at the beginning or the end;
you can try describing the outline rules first, emphasizing the DON’Ts in uppercase, and then giving the detailed demands (see the sketch below)!
GPT-4o’s instruction following is really disappointing!
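As a rough illustration of that ordering (outline rules first, DON’Ts in uppercase, detailed demands last); the wording is only an assumption, not a tested prompt:

```python
# Illustrative prompt layout only; the wording is an assumption, not a tested prompt.
system_message = "\n".join([
    # 1. Outline rule first.
    "You answer customer questions using only the provided context.",
    # 2. Emphasized DON'Ts in uppercase.
    "DO NOT invent store locations, opening hours, or any other facts.",
    "DO NOT answer when the context does not cover the question; say you don't know.",
    # 3. Detailed demands last.
    "Keep answers under three sentences and reply in the user's language.",
])
print(system_message)
```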


The recently announced gpt-4o mini did not seem to have such hallucinations.

https://platform.openai.com/playground/p/i5h0x8LdL2SyQEo2FtgQzT28


Yes. But GPT-3.5 didn’t have this hallucination a month ago when I posted this thread. When one model’s API makes a hallucination mistake in one case, another model might not, and when another model makes a mistake, this model might not.
