I have watched this excellent video, "OpenAI Q&A: Finetuning GPT-3 vs Semantic Search - which to use, when, and why?", as well as a lot of other posts and videos, and I believe I have a good understanding of why, in my case, I definitely need to use embeddings.
However, one aspect of this subject I have not seen addressed is: Is there any advantage to using both?
Let’s say you have a dataset that you both embed and additionally use to fine-tune a model. A user inputs a query. You run a vector search against the embedded dataset and return the top_k hits. You then submit the query, along with the vector search text results as context, to your fine-tuned model as a prompt. Would the completion in this scenario be better than submitting the same prompt to a non-fine-tuned model?
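For concreteness, the retrieve-then-prompt half of that pipeline can be sketched in a few lines. This is only an illustration, not anyone's actual implementation: the four-dimensional vectors below stand in for real embeddings (which would come from an embedding model such as text-embedding-ada-002), and the final prompt string is a hypothetical template.

```python
import math

# Toy corpus: each chunk is paired with a fake 4-dimensional vector.
# In a real system these vectors would come from an embedding model.
corpus = {
    "Fine-tuning adjusts model weights on new examples.": [0.9, 0.1, 0.0, 0.1],
    "Embeddings let you retrieve relevant context at query time.": [0.1, 0.9, 0.2, 0.0],
    "Temperature controls sampling randomness.": [0.0, 0.2, 0.9, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, k=2):
    """Return the k corpus chunks most similar to the query vector."""
    ranked = sorted(corpus.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Pretend this is the embedded user query.
query_vec = [0.2, 0.8, 0.1, 0.0]

# Assemble the prompt: retrieved chunks become context, then the
# whole thing would be sent to the (fine-tuned or base) model.
context = "\n".join(top_k(query_vec))
prompt = (
    "Answer the question using the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: <user query>\nAnswer:"
)
```

The open question in the post is then simply whether sending this same `prompt` to a fine-tuned model beats sending it to the base model.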
That’s a good video, and good YouTube channel, too.
And there are reasons to use both.
You may not realize it, but if you are using text-davinci-003 or the ChatGPT API, then you are using both. Both of these models were fine-tuned by OpenAI to be more conversational.
Maybe this is a misconception on my part, but I see fine-tuning and embedding as contributing very different features to the chatbot experience.
OpenAI themselves suggest that embeddings are a far easier, quicker, cheaper and better way to add new information to their models. They contrast embedding as short-term, last-minute learning with fine-tuning as long-term learning that emerges gradually through the laborious training process. Any fine-tuning "we" users are able to do only tweaks a tiny part of the total neural net; retraining all of it would be prohibitively expensive.
Advocates of embedding-based search-ask QA often observe, as does OpenAI, that fine-tuning is fallible as a source of information: a fine-tuned model may forget, confuse and confabulate answers, because the impact of fine-tuning is diffused across the weights of the neural net. The other side of this is seldom mentioned: the integration that fine-tuning makes possible also lets new information affect the creativity/temperature-related behaviour of the system, albeit only in a very localised way.
So I think that whereas embeddings are obviously better for strictly bounded tasks that seek to deliver answers to a tightly-defined range of questions, there remains a case for fine-tuning if we are interested in more unpredictable, creative and perhaps “thoughtful” answers.
What I’d be interested to hear are theories about the extent to which semantic embedding can also support that broader scope through search-ask: can we enlarge the embedding vectors so much, and build so much content into them, that the context they embody starts to embrace the kinds of creativity we’ve hitherto had to rely on organic brains to provide?