Prevent hallucination with gpt-3.5-turbo

I went to your site and asked some questions like:
- who was the last samurai
- who are you?
I can say that if I were a customer I could be mad at the answers. :grinning: Sorry for that. But I found out one thing.

It needs to be well engineered with instructions.
For example, it could have told me “I am an assistant or chatbot” designed to help or assist… some kind of engagement with customers… Share your prompts with me at

josewambura22@outlook.com let me try to prompt that.

Is there any way that I can send the prompt instructions only one time
and have them apply every time I hit the OpenAI API?

No way.

The OpenAI API requires the prompt and chat history on every call in order to predict the next token. It is not made to remember anything; it just completes a prompt based on what the user sends to it. That is why we save the history outside of OpenAI, summarize it together with the instructions, and then send it to OpenAI. It will act like it is remembering, but in reality it only knows what you are sending in the current request.
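To make that concrete, here is a minimal sketch (using the pre-1.0 `openai` Python library, with a placeholder API key and an invented system message) of keeping the history on your side and resending it, together with the instructions, on every request:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The API is stateless: the system instructions and the conversation so far
# must be resent on every request.
system_message = {
    "role": "system",
    "content": "You are the shop's assistant. Answer only from the provided Q&A.",
}
history = []  # stored on your side, e.g. in a session or database

def ask(user_text):
    history.append({"role": "user", "content": user_text})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[system_message] + history,  # instructions + full (or summarized) history
    )
    answer = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    return answer
```

In practice you would summarize or trim `history` once it grows beyond the context window, but the principle is the same: whatever the model “remembers” is only what you resend.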

BeeHelp has simply been designed to be fed a collection of Q&A about your service/product and to properly answer visitors searching for information about that service/product.

You cannot prevent someone from entering your “commerce” and starting to ask stupid things about samurais. Nor can you prevent people from doing the same with a chatbot.

My main concern is that the chatbot provides CORRECT INFORMATION about my product/service when someone arrives needing info about it.

Believe me, my Q&A-chatGPT is running quite well. I’ve been intensively testing the system with about 500 real visitors over the last 2 months.

The disappointing thing is not that it talks about samurais. My concern is that it says I have a “customer support” email account that I don’t have!

Anyway, thanks for your suggestions.

I’ve developed my own “text preparer to calculate embeddings”. My humble opinion is that the usual approach of making text chunks cut by “text length” is not very useful. So I calculate embeddings of titles, paragraphs, list elements, etc., which makes it possible to retrieve the information relevant to the user’s question more accurately.
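As a rough illustration only (not the actual BeeHelp code), a structure-aware preparer might split on the document’s own units instead of a fixed character length, and embed each unit separately with the pre-1.0 `openai` library:

```python
import openai

def structural_chunks(document):
    """Split on blank lines so each title, paragraph or list keeps its own chunk,
    instead of cutting at an arbitrary fixed length."""
    return [block.strip() for block in document.split("\n\n") if block.strip()]

def embed_chunks(chunks):
    # One request for the whole batch of chunks.
    response = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=chunks,
    )
    return [item["embedding"] for item in response["data"]]
```

Each embedded block then maps back to a single title, paragraph, or list, so the retrieved context stays coherent.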

I’ve not done tests comparing both systems, but I’ve been using it for 2 months and so far it works excellently. Obviously, it is not so easy to summarize here, but this is the central idea of my embedding system.

Anyway, thanks for your link.

Note: the embedding system used has no relevance to hallucination. Reading some of your comments, I would say that you are using the term “hallucination” incorrectly. Asking chatBing, it says:

In the context of text generation, hallucination refers to mistakes in the generated text that are semantically or syntactically plausible but are in fact incorrect or nonsensical. In other words, you can’t trust what the machine is telling you.

Cheers.

Have you tried using the GPT-3.5 answer as a prompt (in another instance of a GPT-3.5 chat) and asking whether there are any hallucinations in it?
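Something along these lines, as a sketch with the pre-1.0 `openai` library (the reviewer prompt wording is just an example):

```python
import openai

def check_for_hallucination(source_text, answer):
    """Second pass: ask a fresh chat completion whether the answer
    is fully supported by the source text."""
    review = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[
            {"role": "system", "content": "You verify answers against a source text."},
            {"role": "user", "content": (
                f"Source text:\n{source_text}\n\n"
                f"Answer:\n{answer}\n\n"
                "Does the answer contain any claim not supported by the source text? "
                "Reply YES or NO, then explain briefly."
            )},
        ],
    )
    return review["choices"][0]["message"]["content"]
```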

So I realize I’m late to the party but maybe I can provide some insight as to why instructions like this don’t work and probably only sometimes worked with Davinci.

The issue is these models have read the entirety of the internet and feel that they always have an opinion worth expressing on a given subject. The only way to get them to truly dodge hallucinations is to present them with a set of grounding facts in the prompt and then ask them the question “are all the facts needed to answer the user’s question in the text above?” You can then get them to provide an answer that’s better grounded in fact. Even then, there are some hallucinations they can’t see past, even with GPT-4.
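A minimal sketch of that grounding gate, assuming the pre-1.0 `openai` library and an invented refusal phrase:

```python
import openai

def grounded_answer(facts, question):
    """Ask first whether the retrieved facts are sufficient, then answer
    only from those facts (or admit the information is missing)."""
    messages = [
        {"role": "system", "content": "Answer using only the facts provided."},
        {"role": "user", "content": (
            f"Facts:\n{facts}\n\n"
            f"Question: {question}\n\n"
            "Are all the facts needed to answer the question contained in the text above? "
            "If yes, answer using only those facts. "
            "If no, reply exactly: 'I don't have that information.'"
        )},
    ]
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=messages,
    )
    return response["choices"][0]["message"]["content"]
```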

Instead of packing this info into a single “system” message, you’ll get more mileage by including a series of example pairs as “user”+“assistant” messages. So include an example user message of “what is the cost of the service?” followed by an assistant message of “You can enjoy a free account with certain limitations, and then upgrade to a PREMIUM account for unlimited use.” You could store these examples in a local vector database and then use semantic search to find the top 5 examples to include in your prompt; see the sketch below.
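Here is one way that retrieval could look, as a hedged sketch (pre-1.0 `openai` library, an in-memory list standing in for the local vector database, and a dot-product score since ada-002 embeddings are unit-normalized):

```python
import numpy as np
import openai

# Stored example Q&A pairs, embedded once (stand-in for a local vector store).
examples = [
    {"user": "What is the cost of the service?",
     "assistant": "You can enjoy a free account with certain limitations, "
                  "and then upgrade to a PREMIUM account for unlimited use."},
    # ... more examples ...
]

def embed(text):
    data = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    return np.array(data["data"][0]["embedding"])

for ex in examples:
    ex["vector"] = embed(ex["user"])

def build_messages(question, k=5):
    """Prepend the k most similar example pairs as user/assistant messages."""
    q_vec = embed(question)
    ranked = sorted(examples,
                    key=lambda ex: float(np.dot(q_vec, ex["vector"])),
                    reverse=True)
    messages = [{"role": "system", "content": "You are the shop's assistant."}]
    for ex in ranked[:k]:
        messages.append({"role": "user", "content": ex["user"]})
        messages.append({"role": "assistant", "content": ex["assistant"]})
    messages.append({"role": "user", "content": question})
    return messages
```

The model then imitates the retrieved examples’ tone and scope, which tends to keep answers closer to your real Q&A than a long block of instructions.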

To prevent the model from hallucinating an email account, maybe you can give it a real one.
