Feeding a large amount of data to the ChatGPT API

I aim to provide a substantial knowledge base, approximately 200 pages in size, to the ChatGPT API and then query it for specific information. My goal is to have ChatGPT locate the answers within this knowledge base. However, a challenge arises due to the limited input message capacity of the API.

How can I do this?

Hi there and welcome to the Dev Forum!

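The most effective approach for Q&A use cases is to convert your knowledge base into embeddings and then use retrieval-augmented generation (RAG): for each question, you retrieve only the most relevant passages and send those to the model, which sidesteps the input size limit. If the concept is new to you, you can get a good first technical overview, complete with a worked example, here.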
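To make the retrieval step a bit more concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and `retrieve`, `build_prompt`, and `knowledge_base` are hypothetical names made up for this example. In a real pipeline you would swap in an actual embedding model and send the resulting prompt to the chat API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    """Assemble a prompt that contains only the retrieved context."""
    context = "\n\n".join(retrieve(question, chunks, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical three-chunk knowledge base, for illustration only.
knowledge_base = [
    "Refunds are issued within 14 days of a return request.",
    "The warranty covers hardware defects for two years.",
    "Support is available by email on weekdays.",
]
```

Calling `build_prompt("How long does the warranty last?", knowledge_base, top_k=1)` produces a prompt whose context is just the warranty chunk, so the model only ever sees a small, relevant slice of the 200 pages.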

If you are looking to limit your own effort in building out a RAG pipeline, then you should consider OpenAI’s Assistants API, where much of the heavy lifting is done for you. You can upload files, and they are automatically chunked, converted into embeddings, and saved to a vector store; you can then perform Q&A on the stored data. You can get an introductory overview of Assistants here and the deep dive on file search here.
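If you are curious what "chunked" means in practice, here is a rough sketch of splitting text into overlapping chunks. This is not how OpenAI's file search is implemented; `chunk_text` is a hypothetical helper, it counts words rather than tokens, and the sizes are arbitrary values picked for illustration.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks that overlap by `overlap` words.

    Assumption: words are a rough proxy for tokens; a real pipeline
    would typically count tokens with a tokenizer instead.
    """
    if chunk_size <= 0 or overlap < 0 or overlap >= chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval. Each chunk would then be embedded and stored, which is the part the Assistants file search handles for you.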

I hope this helps to get you started.

Feel free to let us know if you have more specific questions.
