Hello, I’m building an MVP with Assistants.
I created a voice assistant for a specific purpose.
I have one text file uploaded, which is 77k characters (11,400 words) long and is stored in a vector store.
My input (instruction) prompt is around 500 tokens, but whenever the assistant needs to search this text file for information it spends 17k to 20k tokens.
Is this normal or do I have a terrible leak somewhere?
P.S. Using GPT-4 Turbo.
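In case it helps, here’s roughly how the assistant is set up (simplified sketch; the file path, vector store name, and instruction text are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Upload the text file, then attach it to a vector store
uploaded = client.files.create(file=open("my_document.txt", "rb"), purpose="assistants")
vector_store = client.beta.vector_stores.create(name="voice-assistant-docs")
client.beta.vector_stores.files.create(vector_store_id=vector_store.id, file_id=uploaded.id)

# The assistant uses file_search against that vector store
assistant = client.beta.assistants.create(
    model="gpt-4-turbo",
    instructions="(my ~500-token instruction prompt goes here)",
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```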
The built-in retrieval mechanism is very greedy and it will grab anything and everything it thinks might be even remotely relevant.
At this point, this behaviour should be expected; there really isn’t anything you can do to mitigate it short of going through and trimming the fat from your uploaded document, in the hope that there will be fewer tangentially related chunks available for it to ingest.
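If you want to confirm that the retrieved chunks are what’s eating the tokens, you can look at a run’s reported usage once it completes (rough sketch with the Python SDK; the thread and run IDs are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# usage is only populated after the run has completed
run = client.beta.threads.runs.retrieve(run_id="run_abc123", thread_id="thread_abc123")

# prompt_tokens covers your instructions, the conversation so far, and every
# retrieved chunk the file search stuffed into the context
print("prompt tokens:    ", run.usage.prompt_tokens)
print("completion tokens:", run.usage.completion_tokens)
print("total tokens:     ", run.usage.total_tokens)
```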
It’s a bit counter-intuitive, and goes against the grain of embedding documents, to go from text → PDF, especially if you just saved it without performing any work on it. I would actually say that there’s something wrong here.
Do you mind sharing this document? Going from 17k → 1k tokens is quite an accomplishment if the results are just as accurate.
Or, at the least, what kind of document was it? I could maybe see a tabular document performing better if the text isn’t baked into the document and is better read row by row.
Idk if my math is right here, but 1 word is usually around 1.33 tokens. If your document is 11,400 words, the complete document works out to ~15,000 tokens, so a 17-20k-token search suggests retrieval is pulling in most, if not all, of it.
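If you want an exact count instead of the 1.33 rule of thumb, you can measure it with tiktoken (assuming the file is plain text; the path is a placeholder):

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by GPT-4 Turbo

with open("my_document.txt", encoding="utf-8") as f:
    text = f.read()

print(f"{len(text):,} characters -> ~{len(enc.encode(text)):,} tokens")
```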