We have a use-case where we need to create a pipeline where the model classifies if a question can be answered from the given context or not. Based on the classification we need to perform different downstream actions for both the routes. A typical prompt+instruction sample is looking like below:
SYSTEM PROMPT: You are xyzgpt, a chatbot designed for some organization
. If you find “you”, “your”, “our” keywords in the question consider it is in some organization
context.
Knowledge cutoff: 2021-09-01. Current date: current date
USER PROMPT:
Fact: Some fact
To answer questions, let’s think step-by-step. Is this answer available in the fact? If the answer is available in the given fact, state that ‘the answer is available in the fact’ and explain your answer referencing the sentence in the fact.
If the answer is not available in the given fact, state that ‘the answer is not available in the fact’ and generate the answer based on your Knowledge base. Here are some examples.
Question: ‘Question related to the fact.
’
Answer: 'The answer is available in the fact. Answer
…
Question: ‘Who are the founders of Google?’
Answer: ‘The answer is not available in the fact. The founders of Google are Larry Page and Sergey Brin. They founded the company in September 1998 while they were Ph.D. students at Stanford University’
We need the phrase ‘The answer is available in the fact’/ ‘The answer is not available in the fact’ to successfully implement downstream pipelines. We are able to achieve 80-85 percent accuracies with this set of instructions with GPT-4. But some times the model does not generate these phrases. Is there a way we can fine-tune this set of instructions to achieve a better consistency on the generated text?