I am exploring integrating GPT into our own chat sessions based on our own custom knowledge base.
I’m using LlamaIndex to build an index over my company’s custom knowledge base and then asking questions about the material. This works well, but it still answers unrelated questions. For example, if I query the index with “What is the diameter of the moon?”, it will still answer it.
Obviously I don’t want this type of question answered in our business chat sessions. How exactly do I limit the scope of the conversation? It should instead say something like “Based on my research I’m not able to find an answer to your question”, or something similar that I can program.
Several proposals. In my experience, you get the best behavior when you actually combine all of them:
- Clearly specify, via prompt engineering, the questions that should not be answered. Something like “You should always refuse to answer questions that are not related to this specific domain” should help a lot.
- Include a binary classifier that determines whether a question is “on-topic” or “off-topic” for your particular use case. You can use a cheap fine-tuned OpenAI model for this, or open-source models (Hugging Face).
- Include a minimum similarity threshold when retrieving documents to answer questions. If no document surpasses this threshold, decline to answer politely (with a pre-specified formula).
- Use content moderation (OpenAI’s free endpoint) to filter out inappropriate requests.
- Include regex filtering as an extra security layer against things like prompt injection (especially if you’re exposing your app to external customers).
- Probably many others
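A minimal sketch of the similarity-threshold idea from the list above, assuming you already have embeddings for your documents. The vectors, the `0.75` threshold, and the refusal text are made-up illustrations; in practice the embeddings would come from your embedding model (e.g. via LlamaIndex), and you would tune the threshold on real queries:

```python
import math

# Hypothetical pre-computed embeddings for two knowledge-base documents.
DOC_EMBEDDINGS = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
]

REFUSAL = "Based on my research I'm not able to find an answer to your question."

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def answer_or_refuse(query_embedding, threshold=0.75):
    # Score the query against every document; refuse politely when nothing
    # in the knowledge base is similar enough to the question.
    scores = [cosine_similarity(query_embedding, d) for d in DOC_EMBEDDINGS]
    best = max(scores)
    if best < threshold:
        return REFUSAL
    return f"Answering from document {scores.index(best)} (score {best:.2f})"
```

An off-topic query (a vector pointing away from every document) falls below the threshold and gets the canned refusal instead of reaching the model at all.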
Hope that helps!!
Yes, thank you very much, I will explore all of these options!
One thing I can mention here is the concept of text embeddings. You take the data that is available for your business domain or content and create embeddings from it, then store that information in a vector database. Embeddings are basically mappings from text to vectors, i.e. positions in the model’s semantic space, so pieces of text with similar meaning end up close together.

At query time you search your database for relevant context. If context is found, you pass the user’s question to the OpenAI API along with the retrieved context in the same prompt. This gives you granular control over the responses: if a question goes off topic, nothing matching it exists in your context, and you respond with a canned answer that says so.

So basically: you convert your content to embeddings using OpenAI’s embedding models, save them, and then run every prompt through that subset of your own data. You will need to do more research to understand this and learn how to do it. Here is a link: How to prevent ChatGPT from answering questions that are outside the scope of the provided context in the SYSTEM role message? - #3 by caos30
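As a rough sketch of the flow described above: store (text, embedding) pairs, search them at query time, and only build a prompt for the chat model when local context exists. The `TinyVectorStore` class, the two-dimensional embeddings, and the `0.8` threshold here are purely illustrative stand-ins for a real vector database and a real embedding model:

```python
import math

class TinyVectorStore:
    """A toy in-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self.records = []  # list of (text, embedding) pairs

    def add(self, text, embedding):
        self.records.append((text, embedding))

    def search(self, query_embedding, threshold=0.8):
        # Return stored texts whose cosine similarity clears the threshold.
        hits = []
        for text, emb in self.records:
            dot = sum(q * e for q, e in zip(query_embedding, emb))
            norm = (math.sqrt(sum(q * q for q in query_embedding))
                    * math.sqrt(sum(e * e for e in emb)))
            if norm and dot / norm >= threshold:
                hits.append(text)
        return hits

def build_prompt(query, store, query_embedding):
    # Only send the question onward when local context exists; otherwise
    # the caller short-circuits with the "not able to find an answer" reply.
    context = store.search(query_embedding)
    if not context:
        return None
    return ("Answer using only this context:\n"
            + "\n".join(context)
            + f"\n\nQ: {query}")
```

With this gate in place, an off-topic question like the moon example never reaches the chat model, because no stored context matches it.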