How to limit question results to a proprietary dataset?

One thing worth mentioning here is the concept of embeddings. You take the data for your business idea or content, create embeddings from it, and store them in a vector database. Embeddings are basically vectors that represent the meaning of a piece of text in the model's vector space, so semantically similar text ends up close together.

It is a lot to cover here, but the flow is: you search your database for relevant context, and if something is found, you pass that retrieved context together with the user's question in the same prompt to the OpenAI chat model. That gives you granular control over the responses. If a question goes off topic, because nothing in your context matches it, you respond with an answer that says so.

So basically you take your content, convert it to OpenAI embeddings, save them, and then every prompt is grounded in that subset of your own data before it reaches the model. You will need to do more research to understand this and learn how to do it. Here is a link: How to prevent ChatGPT from answering questions that are outside the scope of the provided context in the SYSTEM role message? - #3 by caos30
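To make that concrete, here is a minimal sketch of the idea, assuming the OpenAI Python SDK (v1.x), numpy, and the `text-embedding-3-small` / `gpt-4o-mini` model names; the in-memory list, the example documents, and the `0.4` similarity threshold are placeholders of my own. In practice you would store the vectors in an actual vector database instead of a Python list.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Embed your proprietary content once and keep the vectors around.
documents = [
    "Our product ships with a 2-year warranty.",
    "Support is available Monday to Friday, 9am-5pm CET.",
]
doc_vectors = [
    np.array(item.embedding)
    for item in client.embeddings.create(
        model="text-embedding-3-small", input=documents
    ).data
]

def answer(question: str, threshold: float = 0.4) -> str:
    # 2. Embed the question and find the closest stored chunk (cosine similarity).
    q = np.array(
        client.embeddings.create(
            model="text-embedding-3-small", input=question
        ).data[0].embedding
    )
    sims = [float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d))) for d in doc_vectors]
    best = int(np.argmax(sims))

    # 3. If nothing in your own data is close enough, refuse instead of calling the model.
    if sims[best] < threshold:
        return "Sorry, I can only answer questions about our own documentation."

    # 4. Otherwise pass the retrieved context plus the question to the chat model.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer ONLY from the provided context. "
                    "If the context does not contain the answer, say you don't know."
                ),
            },
            {
                "role": "user",
                "content": f"Context:\n{documents[best]}\n\nQuestion: {question}",
            },
        ],
    )
    return response.choices[0].message.content

print(answer("How long is the warranty?"))
```

The refusal check before the chat call is what keeps answers inside your own dataset: off-topic questions never reach the model at all, and the system message is a second guard for anything that slips through.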
