I have several text files containing the transcripts of conversations between various people.
e.g.
Person1: This is what they said
Person2: This is what the other person said
Person3: I have a question. This is my question?
Person1: This is the answer
The conversations are varied, covering a range of topics (they are from a parliament) - and I want a tool with which politicians can ask questions about the proceedings.
These text files are about 50KB in size.
Using ChatGPT, I can paste the contents of these files in chunks, into the web interface. I can then ask ChatGPT to answer questions or summarise the contents.
How can I replicate this behaviour using the API?
I have tried a few different approaches, and even asked ChatGPT - but have made no real progress!
You're going to be limited by the context length, so if the text is small enough you can bring a text file in directly and, with the right prompts, get answers.

For files that exceed the context length, you have to chunk the file, store the chunks in a database, and retrieve the relevant ones for each question. This video has a good example using Pinecone: GPT-4 Tutorial: How to Chat With Multiple PDF Files (~1000 pages of Tesla's 10-K Annual Reports) - YouTube.

It's also possible to use the API to generate a knowledge graph from the data locally and use that, with context pointers, to interrogate the data.

The final method, which I haven't explored yet, is to fine-tune a model on your particular data set. You would have to look into that more; it could be a viable option.
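For the direct approach, here is a minimal sketch using the OpenAI Python SDK. It assumes the `openai` package is installed, `OPENAI_API_KEY` is set in the environment, and a file called `transcript.txt` holds one conversation; the model name and prompt wording are illustrative, not prescriptive - a 50KB file is roughly 12-15k tokens (at the usual ~4 characters per token), so it should fit in a large-context chat model in one go:

```python
def build_messages(transcript: str, question: str) -> list[dict]:
    """Wrap the whole transcript in a system message and ask a question about it."""
    system = (
        "You answer questions about the following parliamentary transcript. "
        "Base your answers only on the transcript.\n\n" + transcript
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    # Assumes `pip install openai` and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()
    with open("transcript.txt", encoding="utf-8") as f:
        transcript = f.read()

    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any chat model with enough context works
        messages=build_messages(transcript, "Summarise the key points of the debate."),
    )
    print(response.choices[0].message.content)
```

This is essentially what pasting into the web interface does, minus the manual chunking.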