Hello, I’m trying to use ChatGPT to get answers from a book. The book is split into sentences, each with a unique ID, and stored as a CSV file. It’s about 15 MB, so attaching it to the prompt is not an option, as I’m planning to run this hundreds of times a day.
For example, here’s a sentence from the book:
He hung his hooded cloak on the nearest peg, and “Dwalin at your service!” he said with a low bow
When I ask the question "what should I do after hanging my clothes at the door?", ChatGPT should give me the ID of this sentence, because Dwalin hangs his cloak on the peg and says something, which is relevant to my question. It should also return other sentences in the book that seem relevant to my question.
The question itself is not optimized to target that specific sentence in the book; it’s just a random everyday question. But I want to find sentences that seem relevant to it.
So far I have tried multiple things, such as embedding the book and then extracting relevant answers using cosine_similarity, but I don’t get good results. Then I tried long-text instead of short-text; the results became somewhat acceptable, but I still couldn’t get predictable results, such as retrieving a specific sentence from the book even when asking targeted questions.
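For reference, here is a minimal sketch of the embed-and-rank approach described above, so it is clear what is being compared. It is not the exact code that was run. The CSV column names `id` and `sentence`, the IDs `s1` and `s2`, and the helper names `embed`, `build_index`, and `top_k` are all assumptions made for illustration. In particular, `embed` is a toy bag-of-words stand-in that only matches literal word overlap; in a real setup it would be replaced by a call to an actual embedding model.

```python
import csv
import io
import math
import re
from collections import Counter

# Hypothetical CSV layout: the column names "id" and "sentence" are assumptions.
CSV_TEXT = '''id,sentence
s1,"He hung his hooded cloak on the nearest peg, and ""Dwalin at your service!"" he said with a low bow"
s2,"""Excuse me!"" said the hobbit, and off he went to the door."
'''

def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_index(csv_text):
    """Embed every sentence once and keep (sentence id, vector) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["id"], embed(row["sentence"])) for row in reader]

def top_k(question, index, k=3):
    """Return the k sentence ids most similar to the question, with scores."""
    q = embed(question)
    scored = [(cosine_similarity(q, vec), sid) for sid, vec in index]
    scored.sort(reverse=True)
    return [(sid, score) for score, sid in scored[:k]]

index = build_index(CSV_TEXT)
results = top_k("what should i do after hanging my cloak on the peg?", index, k=2)
for sid, score in results:
    print(sid, round(score, 3))
```

The index is built once and reused for every question, so the book itself never has to be attached to a prompt; only the question is embedded at query time.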
Then, as a last resort, I embedded the sentences again, but this time I used ChatGPT to extract the key points of each sentence and then embedded those keywords, for example:
input: ““Excuse me!” said the hobbit, and off he went to the door.”
output: “pardon hobbit say, open door”
But that gave me even worse results.
Is there any other way I can achieve this without attaching the whole file to each prompt?