How to prevent OpenAI from making up an answer

I am building a QnA chatbot. To train the chatbot, I uploaded a document that has process names such as FindCustomer, ContactCustomer, RecordPayment, VoidCharge, AddFlowSheetValue, etc., along with a few lines describing what each process does.

Now when I ask what information is needed to add a flowsheet value, Curie correctly identifies the process AddFlowSheetValue as document 0, shown below:

document: 0,
object: ‘search_result’,
score: 164.705,
text: ‘AddFlowsheetValue This service receives the following data: Customer, Contact, User, Flowsheet row, Value, Comment, Instant, Flowsheet template. The service then files the flowsheet value and comment to the row on the template for the appropriate patient contact.’

However, the response I get from the OpenAI API is:
answers: [
‘The following values can be provided while adding a flowsheet value:’,
‘The following values can be provided while adding a flowsheet value:’

Similarly, if I ask “How to cancel a charge router record”, it finds the following:
document: 0,
object: ‘search_result’,
score: 351.297,
text: ‘VoidCharges this web service voids a charge router record’

However, the answer from the API names a fabricated process. The response I get is:
“You can cancel a charge router record by calling CancelChargeSession”

There is no process called CancelChargeSession in my training document.

My questions:

  1. How do I make sure that the answer returned used document 0? Can I use a threshold value for the score?
  2. How do I prevent OpenAI from making up imaginary processes such as CancelChargeSession?
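
One way to picture the threshold idea in question 1 is the minimal sketch below. It assumes search results shaped like the ones quoted above (a `score` and a `text` field). The cutoff `SCORE_THRESHOLD`, the fallback message, and the helper names are hypothetical and would need tuning against real queries. The sketch also returns the matched document text directly rather than a generated sentence, which is one way to make sure no process name is invented.

```python
# Hypothetical cutoff: not a value recommended by OpenAI, tune it on real queries.
SCORE_THRESHOLD = 100.0
FALLBACK = "I could not find a matching process in the documentation."


def pick_document(search_results, threshold=SCORE_THRESHOLD):
    """Return the highest-scoring result, or None if nothing clears the threshold."""
    if not search_results:
        return None
    best = max(search_results, key=lambda r: r["score"])
    return best if best["score"] >= threshold else None


def answer(search_results, threshold=SCORE_THRESHOLD):
    """Answer only from the matched document text, never from free generation."""
    doc = pick_document(search_results, threshold)
    if doc is None:
        return FALLBACK
    return doc["text"]


# The first result is copied from the post above; the second is a made-up low score.
results = [
    {"document": 0, "score": 351.297,
     "text": "VoidCharges this web service voids a charge router record"},
    {"document": 1, "score": 12.5, "text": "FindCustomer"},
]

print(answer(results))
print(answer([{"document": 0, "score": 5.0, "text": "FindCustomer"}]))
```

With this shape, a question that matches nothing well gets the fallback message instead of a guess.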

I made a video about this


Very helpful. We plan to use Curie so I will first try with that. If that does not work, I will try with davinci. Will post the results here for everyone’s benefit.

Dave, I would like to request a clarification. Ours is a QnA bot. In the video, for training, you have used “Completion” and not QnA. Can you please explain why?

I don’t like the Answers endpoint. It doesn’t work that well in my opinion.

Thanks for the answer Dave.

Dave, I wanted to check whether the following is a correct interpretation of your video for solving two problems we are facing with our chatbot:
A. Preventing OpenAI from confabulating
B. Preventing OpenAI from responding using any outside info.


  1. We use 0 temp.
  2. We have roughly 600 web services. For some of the APIs the documentation includes input and return parameters.
  3. We use GPT to answer 3 questions (Is there an API to do X, what are the input parameters for API X, what is the return value for API X).
  4. For the first question, to prevent it from making up an answer: when we ask “how to get list of customers”, we keep the correct answer “Use GetCustomerAPI” because we know we have GetCustomerAPI. However, when we ask “How to get a list of customer by age”, GPT responds with an answer and we change that answer to “This API does not exist.” We do this because we don’t have an API that returns customers by age.
  5. We do the same for the remaining two questions. When the input and return parameters are available, we leave the answer as is. However, when such parameters are missing, we modify the answer to “We currently don’t have that information”.
  6. Using the data from 4 and 5, we train the model. In this case maybe we can use all 600 API documents.

That’s an interesting use. It’s basically a matching problem, so I’m wondering why you’re using GPT-3 for it? Either way, you’ll probably get better results by just using the EMBEDDING endpoint and calculating the dot product. You can match between multiple descriptions of endpoints and what the customer is asking for. I use this technique for neural memory search in my latest ACOG video. Link coming shortly
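
The matching idea described here could be sketched as follows. This is a minimal illustration, not the method from the video. The three-dimensional vectors are toy stand-ins for real embeddings, which in practice would come from an embedding model, and the helper names are hypothetical. Vectors are normalized first so that the dot product equals cosine similarity.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def best_match(query_vec, api_vecs):
    """Return (api_name, similarity) for the API description closest to the query."""
    q = normalize(query_vec)
    scored = [(dot(q, normalize(v)), name) for name, v in api_vecs.items()]
    score, name = max(scored)
    return name, score


# Toy vectors: assumptions standing in for embeddings of each API description.
API_VECTORS = {
    "VoidCharges": [0.9, 0.1, 0.0],
    "AddFlowsheetValue": [0.1, 0.9, 0.1],
    "FindCustomer": [0.0, 0.2, 0.9],
}

# Pretend embedding of "How to cancel a charge router record".
query = [0.8, 0.2, 0.1]

name, score = best_match(query, API_VECTORS)
print(name, round(score, 3))
```

Because the answer is always one of the keys in `API_VECTORS`, this approach cannot return a process name that is absent from the documentation.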

The reason we are using GPT (it may still be an incorrect use) is that the use case I provided above is not the entire use case. For example, we want to reach a level where, when a user asks how to get a list of customers above a certain age, the bot says to get a list of all customers with the GetCustomer API and then check the age of each customer with the GetCustomerDOB API. So even though no API exists to do it in one shot, the user can get there by combining two APIs.

One other use case is that a user asks “does product X integrate with a CRM, and if yes, then which one”. We want the bot to answer “X integrates with Salesforce but not with Dynamics.”

You might be onto something. Helping users identify your services is critical. Here’s how I would approach this problem:

Generate a bunch of synthetic data about chats that include customers seeking services and the chatbot asks questions to help identify their needs. This chatbot would be step one. Once you have that conversation, you can train a matching algorithm to the APIs.

I have several videos about creating synthetic data and chatbots. My education chatbot will be most relevant.

Thanks. Will look into it.

Dave- Thanks for all the help. Based on our exchange, I think I know what needs to be done. I have summarized that below with a set of questions. Based on your response, maybe I will create a “Productize OpenAI” guide to help the community.

Objective: Create a chatbot that can answer questions related to 600 APIs. Usually there would be two questions:

  1. User: How do I do X? Bot: To do X, use API API_NAME
  2. User: What inputs are needed for API_NAME? Bot: The input parameters are Input1, Input2, Input3.


  1. Take a list of 200 APIs and their associated details and create 200 contexts.
  2. Ask the two questions as above and get the answers to create synthetic data. With two questions each we will have 400 responses for finetuning. Use a temperature of zero to ensure that the answers are deterministic.
  3. If the response created by GPT is incorrect, then manually modify it. After such modifications have been made, as needed, use the data to finetune.


  1. We only used data from 200 APIs to finetune. How do I supply the remaining 400 APIs as a knowledge base?
  2. How to stop confabulation? For example, there is no API to make tea. But if I use the approach above and a user asks “how to make tea”, the bot would respond in one of two ways: 1. Use MakeTea API or 2. Put water in a pan on stove…
  3. Same as question 2, for input parameters. Due to shoddy product management, let’s say one of the API descriptions is missing input parameters. In such cases, the bot should respond, “The documentation lacks this info. Please contact the product manager”.
  1. You can use templates to synthesize the data for the remaining 400 samples

  2. You just include many negative examples (adversarial) that lead to dead ends (see the adversarial examples in my second tutor chatbot video)

  3. Same answer as 2
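
The three answers above (templates, plus negative examples that lead to dead ends) could be combined roughly as in the sketch below. It is only an illustration under stated assumptions: the `prompt`/`completion` JSONL layout follows the legacy fine-tuning format, the `task` wording for each API and the helper `build_examples` are hypothetical, and the refusal strings are taken from the posts above.

```python
import json

NO_API = "This API does not exist."
NO_PARAMS = "The documentation lacks this info. Please contact the product manager."


def build_examples(apis, negative_tasks):
    """Fill the two question templates per API, then add dead-end examples."""
    examples = []
    for api in apis:
        examples.append({
            "prompt": f"How do I {api['task']}?",
            "completion": f"To {api['task']}, use API {api['name']}",
        })
        inputs = api.get("inputs")
        if inputs:
            completion = "The input parameters are " + ", ".join(inputs) + "."
        else:
            # Negative example: the documentation has no input parameters.
            completion = NO_PARAMS
        examples.append({
            "prompt": f"What inputs are needed for {api['name']}?",
            "completion": completion,
        })
    for task in negative_tasks:
        # Negative example: no API exists for this task.
        examples.append({"prompt": f"How do I {task}?", "completion": NO_API})
    return examples


# API names and inputs come from the thread; the "task" phrasing is assumed.
apis = [
    {"name": "VoidCharges", "task": "void a charge router record", "inputs": None},
    {"name": "AddFlowsheetValue", "task": "add a flowsheet value",
     "inputs": ["Customer", "Contact", "User", "Flowsheet row", "Value",
                "Comment", "Instant", "Flowsheet template"]},
]
negative_tasks = ["make tea"]

examples = build_examples(apis, negative_tasks)
jsonl = "\n".join(json.dumps(e) for e in examples)
print(jsonl)
```

The same function can be run over all 600 API descriptions, so the remaining 400 do not need hand-written examples.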

David, can you please shed more light on #1? So let’s say I have 600 APIs.

  1. I use 200 APIs to train the model. One such API is “CleaningRequestComplete”.
  2. In the remaining 400 APIs, there is an API “CookingRequestComplete”.

How do I use the template so that the model knows about “CookingRequestComplete” and gives the correct answer, while if someone asks how to mark “PictureHangingRequestComplete”, it answers that the question is beyond its scope (let’s say there is no such API)?

It would take many pages to explain that via text, but I’ve already got that answered several times. Also, to be fair, I do not think your approach is feasible. You’ll have better luck by copying Alexa’s skills approach, which is infinitely extensible and you can use the Embeddings endpoint to match - which would be much much much simpler. Essentially what you’ve got here is a search problem but you’re treating it like a chat problem. You might also look up matching algorithms such as recommender systems like Netflix and Amazon.

You’ll want to watch some of my videos about how to generate synthetic data. In most of my finetuning videos, I demonstrate how to create synthetic data. Here are a few good videos with associated code in the description: