Looking for help with example contexts and the answers I receive

I’m just getting started with OpenAI. My use case: I want to upload a dataset and then be able to ask questions about the information within that data. I have it working fairly well, but there are a few things I’m still wondering about, and I’m hoping someone here has some insight.

  1. If I ask a question and pass some example contexts along with it, do I need to keep passing those same example contexts with all subsequent requests, or does it ‘remember’ them moving forward? The impression I get from the help docs is that I need to keep sending them, but using the API it seems like I can delete the contexts I’ve already passed and still get the same quality of answer to my questions. That seems odd, because before I initially added those contexts, I would not get the type of answers I was looking for.

  2. If I ask a question multiple times in a row, it gives me a different answer each time. But if I ask a question once, then ask some different questions, then ask my initial question again, I seem to get the same answer as the first time I asked it. Why? Is it that when the same question is asked twice in a row, the first answer is assumed to be incorrect?

  3. Sometimes I’m given an answer that includes information that did not come from the data I uploaded. I’m assuming in these cases it’s pulling information from other sources (correct me if I’m wrong, though!). Is there a way to prevent that? I’d like answers to be based entirely on the information I provide, and not on other material from the web.

  1. Yes, you have to include them in every request. If you want to stop sending the examples, I suggest fine-tuning the model instead (your prompt examples become training data). See the sketch after this list.
  2. Can you clarify your question with the input/output examples you’re using in this case?
  3. The model memorizes a lot of information through its parameters during pre-training. It might be better to prompt the completion model to explicitly ‘re-write’ the data retrieved by the search model, instead of simply generating text with the data loaded into its context.
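
To illustrate point 1, here is a minimal sketch of a stateless request, assuming the legacy Answers endpoint from the pre-1.0 `openai` Python library. The API keeps no memory between calls, so `examples_context` and `examples` have to travel with every request; the question, file ID, and example Q/A pair below are all placeholders:

```python
import openai

openai.api_key = "sk-..."  # your API key

# Stateless call: nothing from a previous request is remembered, so the
# example context and Q/A examples are sent again every time.
response = openai.Answer.create(
    search_model="ada",        # model used to rank documents in the file
    model="curie",             # model used to write the final answer
    question="What is the return policy?",     # hypothetical question
    file="file-abc123",                        # hypothetical uploaded dataset ID
    examples_context="Items can be returned within 30 days of delivery.",
    examples=[["Can I return an item?",
               "Yes, within 30 days of delivery."]],
    max_tokens=50,
    stop=["\n"],
)
print(response["answers"][0])
```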

Hi Ali, thanks for the response.

  1. Got it. I think we are okay sending it with each request for now, but I just wanted to understand if I needed to or not.

  2. I think this one is resolved for me. I was getting different answers for the same question previously, but I noticed some errors in my example contexts. I’m not sure if they were causing the issue but after cleaning them up I am getting consistent answers when asking the same question multiple times.

  3. I’m not quite sure I follow you on this part:

“prompt the completion model to explicitly ‘re-write’ the data retrieved by the search model, instead of simply generating text with the data loaded into its context.”

Do you know of any examples that demonstrate this type of approach?

  1. You can break the problem into two steps so that the model isn’t injecting info from outside the knowledge base. The first step is to retrieve the most relevant info from the knowledge base using the search model; then load that data into the prompt of a completion model with instructions to generate an answer using only the loaded data.
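
A minimal sketch of that two-step flow, assuming the legacy Search and Completions endpoints from the pre-1.0 `openai` Python library; the question and knowledge-base passages are hypothetical placeholders:

```python
import openai

openai.api_key = "sk-..."  # your API key

question = "What is the return policy?"        # hypothetical question
knowledge_base = [                             # hypothetical passages from your dataset
    "Orders ship within 2 business days.",
    "Items can be returned within 30 days of delivery.",
    "Gift cards are non-refundable.",
]

# Step 1: retrieve the most relevant passages with the search model.
search = openai.Engine("ada").search(documents=knowledge_base, query=question)
ranked = sorted(search["data"], key=lambda r: r["score"], reverse=True)
top_passages = [knowledge_base[r["document"]] for r in ranked[:2]]

# Step 2: have the completion model answer from those passages only.
prompt = (
    "Answer the question using only the context below. If the answer "
    'is not in the context, reply "I don\'t know."\n\n'
    "Context:\n" + "\n".join(top_passages) + "\n\n"
    f"Question: {question}\nAnswer:"
)
completion = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=64,
    temperature=0,    # reduce variation between runs
    stop=["\n"],
)
print(completion["choices"][0]["text"].strip())
```

The explicit “only the context below” instruction plus the “I don’t know” escape hatch is what discourages the model from pulling in facts it memorized during pre-training.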