Does fine-tuning improve GPT-3.5/4 retrieval speed?

I am using GPT-4 with an assistant and retrieval enabled, and I find it can take up to 40 seconds to respond when asked about something in the attached files. I have 12 attached PDF files; they aren't very large, maybe 4 pages each.
I am wondering if I fine-tuned the model to give it more information about the files, then maybe I wouldn't need to attach the files and it would increase the speed.
I have tried converting all of the PDFs to a single text file and uploading that, but it hasn't improved the speed.

Has anyone had experience with this?


There is a common misconception about what fine-tuned models do. Fine-tuning doesn't give a model new knowledge; it teaches the model the writing style and response format of your examples. Feeding it pages of PDFs (i.e. chunks of text) will likely do nothing useful, and may even degrade the model's quality if you don't know how to do it.
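To make the distinction concrete: fine-tuning data for the chat models is a JSONL file of example conversations, and what the model picks up is the shape and tone of the assistant replies, not the facts inside them. A minimal sketch (the filenames and conversation content here are made up):

```python
import json

# Each line of a fine-tuning file is one example conversation.
# The model learns *how* the assistant answers, not new knowledge.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You answer questions about our product docs."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Go to Settings > Account > Reset Password."},
        ]
    },
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity check: every line must parse back into a dict with a "messages" list.
with open("train.jsonl") as f:
    for line in f:
        record = json.loads(line)
        assert isinstance(record["messages"], list)
```

Notice there is nowhere in this format to "upload a PDF" — if your 12 PDFs' facts aren't restated in the assistant turns, the model never sees them, which is why this won't replace retrieval.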

If you want it to use new information, embeddings are the way to go. It's slow because it performs a similarity search over chunks of your text (similar to a search engine), then attaches the results to the beginning of your prompt and asks the model to respond with that context. This takes a little while, as you can see. It's not 100% foolproof, but for now it's the way to go.
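The retrieval loop described above can be sketched in a few lines. This is a toy illustration: in practice each chunk's vector comes from an embeddings API and is computed once up front, so only the query needs embedding at ask time; the chunk texts and vector values below are invented for the example.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy corpus: chunk text -> precomputed embedding vector.
chunks = {
    "Refunds are processed within 5 business days.": [0.9, 0.1, 0.0],
    "The warranty covers manufacturing defects for 2 years.": [0.1, 0.9, 0.2],
}

def retrieve(query_vec, top_k=1):
    # Rank all chunks by similarity to the query and keep the best ones.
    ranked = sorted(chunks, key=lambda c: cosine(chunks[c], query_vec), reverse=True)
    return ranked[:top_k]

# Pretend this vector is the embedding of "How long do refunds take?".
query_vec = [0.85, 0.15, 0.05]
context = "\n".join(retrieve(query_vec))

# The retrieved chunks are prepended to the prompt before calling the model.
prompt = f"Answer using this context:\n{context}\n\nQuestion: How long do refunds take?"
```

The similarity search plus the extra context tokens are where the latency comes from, which is why retrieval answers arrive slower than plain chat completions.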


Thanks for the reply, I will try out embeddings.


If you’re using Retrieval, it’s already using Embeddings behind the scenes.

However, you can also implement retrieval yourself with LangChain (this requires some coding experience). As a bonus, you'll have much more control over it.