Embeddings vs finetunes

The project is an “expert” bot. I’ve got a guideline document that the bot is supposed to answer questions about. I’ve created embeddings for the document’s sections; when a new question comes in, I embed it, compare it against the section embeddings, and send the best-matching section to davinci (or more precisely, to text-davinci-002) along with the question. This works pretty well. The concern is the number of tokens this uses, since I’m feeding entire sections of the document to answer a single question. Is it possible, via fine-tunes, to get the model to remember the document? I did some testing and my results, again, were not great. What I’d like is to put the sections into fine-tunes, and then say “based on section 2, answer this question,” where ‘section 2’ is arrived at by comparing the question embedding against my locally saved document embeddings, without having to send section 2 to the API every time a relevant question is asked. Will this work?
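For reference, the comparison step described above is a nearest-neighbor search over embeddings. A minimal sketch, assuming the section embeddings are already computed and saved locally (the section names, vectors, and function names here are illustrative, not the actual code):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_section(question_emb: np.ndarray,
                 section_embs: dict[str, np.ndarray]) -> str:
    """Return the section whose saved embedding is closest to the question's."""
    return max(section_embs,
               key=lambda name: cosine_similarity(question_emb, section_embs[name]))

# Toy embeddings; real ones would come from an embeddings API or a local model.
sections = {
    "section 1": np.array([1.0, 0.0, 0.0]),
    "section 2": np.array([0.0, 1.0, 0.0]),
}
question = np.array([0.1, 0.9, 0.0])
print(best_section(question, sections))  # → section 2
```

The winning section's text is then pasted into the completion prompt, which is exactly where the token cost comes from.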


Sounds like you have created embeddings for each section of your doc, right?
That’s a vector search, and you might get comparable results at no API cost by using SBERT or something similar.

I’m not aware of an effective way to teach GPT-3 the content of a particular document (or a set of documents) through fine-tuning. At least my experiments to do so failed, and I ended up with a vector search using SBERT’s sentence embeddings.
Check out Semantic Search — Sentence-Transformers documentation


This is exactly what I’m doing. Thanks for the link!

I also came across the question recently.

@Multiman, are you sure that GPT-3 cannot be fine-tuned with the content of a particular document?
For example, the links below explain how to create a factual Q&A bot out of a custom dataset, which in this case would need to be created from the guideline document.

The doc you pointed to is not about fine-tuning. It is the standard two-step approach:

  1. Retrieve relevant passages through vector search (a.k.a. semantic search through embeddings). The retriever extracts passages that are semantically similar to the question.
  2. Build a GPT-3 prompt that contains the question and the extracted passages.

No fine-tuning at all. Fine-tuning refers to updating the weights of the LLM in favor of your particular domain.
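Step 2 is just string assembly. A minimal sketch (the template wording is my own, not from the doc):

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a completion prompt from retrieved passages and the question."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    "What does the guideline say about refunds?",
    ["Section 2: Refunds are issued within 30 days of purchase."],
)
print(prompt)
```

The assembled string is what gets sent to the completions endpoint, so every retrieved passage counts against the prompt's token budget.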

Have you tried increasing the number of epochs when fine-tuning the model on your documents? (Say, from 4 to 10 or more.) In theory this will “burn in” the details much more. It’s worth a shot, but may not be any more crisp than a vector-similarity lookup.
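With the legacy OpenAI fine-tunes API, the epoch count was exposed as the `n_epochs` hyperparameter. A sketch of the request parameters (the file ID is hypothetical, and defaults may vary by model):

```python
# Parameters for a fine-tune job with more training epochs than the
# usual default of 4, to "burn in" the documents harder.
fine_tune_params = {
    "training_file": "file-abc123",  # hypothetical uploaded JSONL file ID
    "model": "davinci",
    "n_epochs": 10,                  # raised from the default of 4
}
# In the old openai-python client this would be passed as, e.g.,
# openai.FineTune.create(**fine_tune_params)
print(fine_tune_params["n_epochs"])  # → 10
```

More epochs means more gradient passes over the same training examples, which strengthens memorization but also raises the risk of overfitting.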

Makes sense. Thanks. “Fine-tuning” through embeddings (as used in the title of the doc) is not real fine-tuning (updating the weights of the LLM).

You’ve fine tuned davinci-002?