Is there a way to control the documents OpenAI uses?

Is it possible to restrict OpenAI to, let's say, a Zotero library?

It is not directly possible at this time with the GPT models.
In fact, it is very hard to do with any large language model.
There are two workarounds:

Option 1: Use an open-source model such as BLOOM and fine-tune it on data from Zotero.
Pros: The model's output will be tailored to your Zotero data, though there is no 100% guarantee it will stay within it.
Cons: You will need multiple GPUs training for weeks, plus an ML engineer to build the training pipeline for you. Very expensive in both engineer hours and compute hours.

Option 2: Upload all the data from Zotero into a database that supports embeddings, such as Superinsight.ai. Use its semantic search capabilities to find the content most similar to the input, then put what the search returns at the start of your prompt so GPT-3 has some predefined context to work with.
Pros: You can get much better results using this method, and it's a lot cheaper and easier than fine-tuning a large language model. Superinsight is free for researchers to use and it runs on CPU.
Cons: There is still some technical setup needed to get this working, but it's something most developers can handle.
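
To make Option 2 a bit more concrete, here is a rough sketch of the retrieve-then-prompt idea in Python. It is only an illustration under some loud assumptions: the "embedding" is a toy word-count vector standing in for a real embedding model, the notes are made-up examples rather than real Zotero data, and none of the function names (embed, cosine, top_k, build_prompt) come from Superinsight or OpenAI. In a real setup the database would do the embedding and the search for you, and you would send the resulting prompt to GPT-3.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Put the retrieved passages in front of the question as context."""
    context = "\n".join(f"- {d}" for d in top_k(question, documents, k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical notes, standing in for text exported from a Zotero library.
zotero_notes = [
    "Smith 2019 finds that sleep deprivation impairs working memory.",
    "Lee 2021 surveys transformer architectures for machine translation.",
    "Garcia 2020 reports that caffeine improves reaction time in drivers.",
]

prompt = build_prompt("How does sleep deprivation affect memory?", zotero_notes, k=1)
print(prompt)  # this text is what you would send to the language model
```

Note that the instruction in the prompt only nudges the model toward the retrieved passages. It does not strictly prevent it from drawing on what it learned in training, so this is a way to steer the model, not a hard restriction.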

Hope that helps!
