How to prevent Open AI from making up an answer

chinmay.duke · May 23, 2022, 7:43pm

I am building a QnA chatbot. To train the chatbot, I uploaded a document that has process names such as FindCustomer, ContactCustomer, RecordPayment, VoidCharge, AddFlowSheetValue etc along with a few lines describing what each process does.

Now when I ask what information is needed to add a flowsheet value, Curie correctly identifies the process AddFlowSheetValue as document 0, shown below:

document: 0,
object: ‘search_result’,
score: 164.705,
text: ‘AddFlowsheetValue This service receives the following data: Customer, Contact, User, Flowsheet row, Value, Comment, Instant, Flowsheet template. The service then files the flowsheet value and comment to the row on the template for the appropriate patient contact.’

However, the response I get from the OpenAI API is:
answers: [
‘The following values can be provided while adding a flowsheet value:’,
‘The following values can be provided while adding a flowsheet value:’
],

Similarly, if I ask “How to cancel a charge router record”, it finds the following:
document: 0,
object: ‘search_result’,
score: 351.297,
text: ‘VoidCharges this web service voids a charge router record’

However, the answer from the API names a fabricated process. The response I get is:
“You can cancel a charge router record by calling CancelChargeSession”

There is no process called CancelChargeSession in my training document.

My questions:

How do I make sure that the answer returned used document 0? Can I use a threshold value for the score?
How do I prevent OpenAI from making up imaginary processes such as CancelChargeSession?
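
One way to read the score-threshold idea, as a minimal sketch only: take search results in the shape shown above, keep the top document only if its score clears a cutoff, and otherwise return a fixed fallback without calling the model. The names `MIN_SCORE`, `FALLBACK`, `pick_document` and `build_prompt` are invented for illustration, the cutoff of 100 is an arbitrary assumption that would need tuning on real queries, and the second search result is made up.

```python
# Hypothetical cutoff; the right value has to be found by testing real queries.
MIN_SCORE = 100.0
FALLBACK = "I could not find that in the documentation."


def pick_document(results, min_score=MIN_SCORE):
    """Return the best-scoring search result, or None if nothing clears the cutoff."""
    if not results:
        return None
    best = max(results, key=lambda r: r["score"])
    return best if best["score"] >= min_score else None


def build_prompt(question, results):
    """Build a prompt that restricts the model to the matched document.

    Returns None when no document is a good enough match, so the caller can
    reply with FALLBACK without calling the model at all.
    """
    doc = pick_document(results)
    if doc is None:
        return None
    return (
        "Answer the question using only the document below. "
        f'If the document does not contain the answer, reply "{FALLBACK}"\n\n'
        f"Document: {doc['text']}\n\n"
        f"Question: {question}\nAnswer:"
    )


results = [
    {"document": 0, "score": 351.297,
     "text": "VoidCharges this web service voids a charge router record"},
    # Invented low-scoring entry, for illustration only.
    {"document": 1, "score": 12.4, "text": "FindCustomer looks up a customer"},
]
prompt = build_prompt("How to cancel a charge router record", results)
```

Putting only the matched text in the prompt does not guarantee that the model sticks to it, but it narrows what the model has to work with and gives it an explicit way out.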

daveshapautomator · May 23, 2022, 9:34pm

I made a video about this

chinmay.duke · May 23, 2022, 11:21pm

Very helpful. We plan to use Curie so I will first try with that. If that does not work, I will try with davinci. Will post the results here for everyone’s benefit.

chinmay.duke · May 29, 2022, 3:41pm

Dave, I am requesting a clarification. Ours is a QnA bot. In the video, for training, you used “Completion” and not QnA. Can you please explain why?

daveshapautomator · May 29, 2022, 4:57pm

I don’t like the Answers endpoint. It doesn’t work that well in my opinion.

chinmay.duke · May 29, 2022, 5:14pm

Thanks for the answer Dave.

chinmay.duke · May 30, 2022, 6:38pm

Dave- I wanted to check if the following is a correct interpretation of your video for solving the two problems we are facing with our chatbot:
A. Preventing OpenAI from confabulating
B. Preventing OpenAI from responding using any outside info.

Steps:

1. We use 0 temp.
2. We have roughly 600 web services. For some of the APIs, the documentation includes input and return parameters.
3. We use GPT to answer 3 questions (Is there an API to do X, what are the input parameters for API X, what is the return value for API X).
4. For the first question, to prevent it from making up an answer: when we ask “how to get list of customers”, we keep the correct answer “Use GetCustomerAPI” because we know we have GetCustomerAPI. However, when we ask “How to get a list of customers by age”, GPT would respond and we change that answer to “This API does not exist.” We do this because we don’t have an API to get customers by age.
5. We do the same for the other two questions. When the input and return parameters are available, we leave the answer as is. However, when such parameters are missing, we modify the answer to “We currently don’t have that information”.
6. Using data from 4 and 5, we train the model. In this case maybe we can use all 600 API documents.
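
As a rough sketch of what the data from steps 4 and 5 could look like, here is one way to build prompt/completion pairs in JSONL form, with the corrected "does not exist" and "no information" answers written in by hand. The catalogue layout, the `make_examples` helper, and the assumption that GetCustomerAPI has no documented inputs are all invented for illustration.

```python
import json

NO_API = "This API does not exist."
NO_INFO = "We currently don't have that information."

# Toy catalogue. "inputs" is None when the documentation does not list them
# (assumed here for GetCustomerAPI purely to show the fallback).
api_docs = {
    "GetCustomerAPI": {"task": "get a list of customers", "inputs": None},
}


def make_examples(api_docs, unsupported_tasks):
    """Build prompt/completion pairs, including corrected negative answers."""
    examples = []
    for name, doc in api_docs.items():
        examples.append(
            {"prompt": f"How to {doc['task']}?", "completion": f" Use {name}"}
        )
        inputs = doc.get("inputs")
        answer = ", ".join(inputs) if inputs else NO_INFO
        examples.append(
            {"prompt": f"What are the input parameters for {name}?",
             "completion": " " + answer}
        )
    # Tasks known to have no API get the fixed refusal as their completion.
    for task in unsupported_tasks:
        examples.append({"prompt": f"How to {task}?", "completion": " " + NO_API})
    return examples


examples = make_examples(api_docs, ["get a list of customers by age"])
jsonl = "\n".join(json.dumps(e) for e in examples)  # one JSON object per line
```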

daveshapautomator · May 30, 2022, 8:55pm

That’s an interesting use. It’s basically a matching problem, so I’m wondering why you’re using GPT-3 for it? Either way, you’ll probably get better results by just using the EMBEDDING endpoint and calculating the dot product. You can match between multiple descriptions of endpoints and what the customer is asking for. I use this technique for neural memory search in my latest ACOG video. Link coming shortly
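
A minimal sketch of the dot-product matching described here, under some loud assumptions: the vectors below are tiny made-up stand-ins for real embeddings, `best_match` is an invented helper, and the dot product is only comparable to cosine similarity if the embeddings are unit length. Each endpoint may have several description vectors, and the best-scoring description counts for that endpoint.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def best_match(query_vec, endpoint_vecs):
    """Return (endpoint name, score) for the endpoint whose best description
    vector has the highest dot product with the query vector.

    endpoint_vecs maps an endpoint name to a list of description vectors.
    """
    scored = {
        name: max(dot(query_vec, v) for v in vecs)
        for name, vecs in endpoint_vecs.items()
    }
    name = max(scored, key=scored.get)
    return name, scored[name]


# Toy 3-dimensional vectors; real embeddings have far more dimensions.
endpoint_vecs = {
    "VoidCharges": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "FindCustomer": [[0.0, 0.2, 0.9]],
}
query_vec = [0.85, 0.15, 0.05]  # stand-in for the embedded user question
name, score = best_match(query_vec, endpoint_vecs)
```

Because the answer is always one of the registered endpoint names, this kind of lookup cannot invent a process like CancelChargeSession.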

chinmay.duke · May 30, 2022, 9:25pm

The reason we are using GPT (it may still be an incorrect use) is that the use case I provided above is not the entire use case. For example, we want to reach a level where, when a user asks how to get a list of customers above a certain age, the bot says: get a list of all the customers with the GetCustomer API and then check the age of each customer with the GetCustomerDOB API. So even though no API exists to do it in one shot, the user can get there by combining two APIs.
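
To make the two-step combination concrete, here is a sketch with stubbed services. Only the API names come from the description above. The function signatures, the return shapes, and the sample customers are assumptions.

```python
from datetime import date

# Made-up data standing in for the real services.
_DOBS = {"c1": date(1950, 6, 1), "c2": date(2010, 1, 15)}


def get_customer():
    """Stub for the GetCustomer API: returns all customer ids (assumed shape)."""
    return list(_DOBS)


def get_customer_dob(customer_id):
    """Stub for the GetCustomerDOB API: returns a date of birth (assumed shape)."""
    return _DOBS[customer_id]


def age_on(dob, today):
    """Age in whole years on the given day."""
    years = today.year - dob.year
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1  # birthday has not happened yet this year
    return years


def customers_older_than(min_age, today):
    """Combine the two calls: list everyone, then filter by computed age."""
    return [
        c for c in get_customer()
        if age_on(get_customer_dob(c), today) > min_age
    ]


older = customers_older_than(18, date(2022, 5, 30))
```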

One other use case is that a user asks “does product X integrate with a CRM, and if yes, then which one”. We want the bot to answer “X integrates with Salesforce but not with Dynamics.”

daveshapautomator · May 30, 2022, 9:58pm

You might be onto something. Helping users identify your services is critical. Here’s how I would approach this problem:

Generate a bunch of synthetic data about chats that include customers seeking services and the chatbot asks questions to help identify their needs. This chatbot would be step one. Once you have that conversation, you can train a matching algorithm to the APIs.

I have several videos about creating synthetic data and chatbots. My education chatbot will be most relevant.

chinmay.duke · May 30, 2022, 11:43pm

Thanks. Will look into it.

chinmay.duke · May 31, 2022, 6:27pm

Dave- Thanks for all the help. Based on our exchange, I think I know what needs to be done. I have summarized that below with a set of questions. Based on your response, maybe I will create a “Productize OpenAI” guide to help the community.

Objective: Create a chatbot that can answer questions related to 600 APIs. Usually there would be two questions:

User: How do I do X? Bot: To do X, use API API_NAME
User: What inputs are needed for API_NAME? Bot: The input parameters are Input1, Input2, Input3.

Steps:

1. Take a list of 200 APIs and their associated details and create 200 contexts.
2. Ask the two questions above and get the answers to create synthetic data. With two questions each, we will have 400 responses for finetuning. Use a temperature of zero to ensure that the answers are deterministic.
3. If the response created by GPT is incorrect, manually modify it. After making such modifications as needed, use the data to finetune.

Questions:

1. We only used data from 200 APIs to finetune. How do I supply the remaining 400 APIs as a knowledge base?
2. How do I stop confabulation? For example, there is no API to make tea. But if I use the approach above and a user asks “how to make tea”, the bot would respond in one of two ways: 1. Use MakeTea API or 2. Put water in a pan on stove…
3. Same as Qu#2 for input parameters. Due to shoddy product management, let’s say one of the API descriptions is missing input parameters. In such cases, the bot should respond, “The documentation lacks this info. Please contact the product manager”.

daveshapautomator · June 1, 2022, 12:21am

1. You can use templates to synthesize the data for the remaining 400 samples.
2. You just include many negative examples (adversarial) that lead to dead ends (see the adversarial examples in my second tutor chatbot video).
3. Same answer as 2.
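
A rough sketch of what template-based synthesis with negative examples might look like. This is not taken from the videos mentioned. The templates, the `synthesize` helper, and the refusal text are assumptions for illustration.

```python
import random

QUESTION_TEMPLATES = [
    "How do I {task}?",
    "Is there an API to {task}?",
    "Which API should I call to {task}?",
]
OUT_OF_SCOPE = "There is no API for that."  # hypothetical dead-end answer


def synthesize(apis, negative_tasks, seed=0):
    """Expand every API into several question/answer pairs and mix in
    adversarial questions whose only correct answer is the dead end.

    apis maps an API name to a short description of the task it performs.
    """
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    samples = []
    for name, task in apis.items():
        for template in QUESTION_TEMPLATES:
            samples.append(
                (template.format(task=task), f"To {task}, use {name}.")
            )
    for task in negative_tasks:
        template = rng.choice(QUESTION_TEMPLATES)
        samples.append((template.format(task=task), OUT_OF_SCOPE))
    rng.shuffle(samples)  # avoid having all negatives at the end
    return samples


apis = {"VoidCharges": "void a charge router record"}
samples = synthesize(apis, ["make tea"])
```

The same templates can be run over every API in the catalogue, so no API has to be left out of the generated data.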

chinmay.duke · June 5, 2022, 4:31am

David- Can you please shed more light on #1? So let’s say I have 600 APIs.

I use 200 APIs to train the model. One such API is “CleaningRequestComplete”.
In the remaining 400 APIs, there is an API “CookingRequestComplete”.

How do I use the template so that the model knows about “CookingRequestComplete” and gives the correct answer, while if someone asks how to mark “PictureHangingRequestComplete” it answers that the question is beyond its scope (let’s say there is no such API)?

daveshapautomator · June 5, 2022, 5:23am

It would take many pages to explain that via text, but I’ve already got that answered several times. Also, to be fair, I do not think your approach is feasible. You’ll have better luck by copying Alexa’s skills approach, which is infinitely extensible and you can use the Embeddings endpoint to match - which would be much much much simpler. Essentially what you’ve got here is a search problem but you’re treating it like a chat problem. You might also look up matching algorithms such as recommender systems like Netflix and Amazon.

You’ll want to watch some of my videos about how to generate synthetic data. In most of my finetuning videos, I demonstrate how to create synthetic data. Here are a few good videos with associated code in the description:
