Hello there,
I came across the website chatpdf.com and noticed that it lets you upload PDF files so a GPT model can analyze the content, retain the information, and answer questions about the documents.
To replicate this functionality, I tried transcribing the PDFs and sending the transcriptions to the GPT-3 API, using both the Chat Completions and Completions endpoints. I split the documents into segments so that no single message exceeded the per-message token limit (with Chat Completions); my approach looked roughly like the sketch below. However, when I tried to upload a full-length book, I got an error.
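Reconstructed from memory (I don't have the code in front of me), it was something like this; the model name, chunk size, and prompts are placeholders, not my exact values:

```python
import openai

openai.api_key = "sk-..."  # placeholder

def split_into_chunks(text, max_chars=8000):
    """Naive fixed-size splitter; a real one would cut at sentence boundaries."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def feed_document(transcription, question):
    messages = [{"role": "system", "content": "You will receive a document in parts."}]
    for chunk in split_into_chunks(transcription):
        messages.append({"role": "user", "content": chunk})
    # Even though each chunk fits within one message, the *entire* conversation
    # is sent with every request, so a long book overflows the context window.
    response = openai.ChatCompletion.create(  # pre-1.0 openai SDK style
        model="gpt-3.5-turbo",
        messages=messages + [{"role": "user", "content": question}],
    )
    return response["choices"][0]["message"]["content"]
```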
As I write this, I don't have access to my code, so I can't recall its exact state. After some research, I realized that I was probably sending too much text at once: the limit applies to the whole context, i.e. everything sent with a single request, not just to one message. That led me to wonder how the site mentioned above manages to feed in so much information without exceeding the token limit of a conversation.
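To illustrate the scale of the problem, here is a quick sketch using the tiktoken library to count tokens before sending (the file name is a placeholder; 4,096 tokens was the gpt-3.5-turbo context limit at the time, and other models differ):

```python
import tiktoken

# Tokenizer matching the target model
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def count_tokens(text):
    return len(enc.encode(text))

book_text = open("book_transcription.txt").read()  # placeholder path
# A full book is easily hundreds of thousands of tokens,
# far beyond a single 4,096-token context window.
print(count_tokens(book_text))
```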
I have been grappling with this issue for quite some time and have not found a definitive solution. Would fine-tuning be the right approach? Honestly, I'm not sure. Could someone offer some guidance? I would greatly appreciate any help.