How to limit question results to a proprietary dataset?

AgusPG · March 14, 2023, 4:15pm

Several proposals. In my experience, you get the best behavior when you actually combine all of them:

- Use prompt engineering to state clearly which questions should not be answered. An instruction such as “You should always refuse to answer questions that are not related to this specific domain” should help a lot.
- Include binary classifiers that determine whether a question is “on-topic” or “off-topic” for your particular use case. You can use cheap fine-tuned OpenAI models for this, or open-source models (Hugging Face).
- Set a minimum similarity threshold when retrieving documents to answer questions. If no document surpasses this threshold, decline to answer politely (with a pre-specified formula).
- Use content moderation (OpenAI’s free endpoint) to filter out inappropriate requests.
- Include regex filtering as an extra security layer against things such as prompt injection (especially if you’re exposing your app to external customers).
- Probably many others.
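
To make two of these concrete, here is a minimal sketch of the similarity threshold and the regex filter working together. Everything in it is an assumption for illustration: the 0.8 threshold, the refusal text, the two regex patterns, and the function names (`gate`, `retrieve`, `looks_like_injection`) are all hypothetical, and the embeddings are assumed to be computed elsewhere by whatever model you already use. It is not a complete defense against prompt injection, just the shape of the check.

```python
import math
import re

# Hypothetical values: tune the threshold on your own data.
SIMILARITY_THRESHOLD = 0.8
REFUSAL_MESSAGE = "Sorry, I can only answer questions about our documentation."

# Illustrative patterns only; a real deployment needs a broader, maintained list.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(the\s+)?(previous|above|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(your\s+)?(system\s+)?prompt", re.IGNORECASE),
]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def looks_like_injection(question):
    """True if the question matches any of the illustrative injection patterns."""
    return any(pattern.search(question) for pattern in INJECTION_PATTERNS)


def retrieve(query_embedding, documents, threshold=SIMILARITY_THRESHOLD):
    """Return (text, score) pairs whose score surpasses the threshold, best first.

    `documents` is a list of (text, embedding) pairs.
    """
    scored = [(text, cosine_similarity(query_embedding, emb)) for text, emb in documents]
    kept = [(text, score) for text, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def gate(question, query_embedding, documents):
    """Return the retrieved texts, or None when the question should be declined."""
    if looks_like_injection(question):
        return None
    hits = retrieve(query_embedding, documents)
    if not hits:
        return None
    return [text for text, _ in hits]


def answer_or_refuse(question, query_embedding, documents):
    """Return context to pass to the model, or the pre-specified refusal message."""
    context = gate(question, query_embedding, documents)
    return REFUSAL_MESSAGE if context is None else context
```

In use, you would call `answer_or_refuse` before ever sending anything to the completion model, and only build the prompt when it returns a list of documents. The on-topic classifier and the moderation call would slot in as additional early returns inside `gate`.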

Hope that helps!!
