Knowledge through fine-tuning or RAG embedding

Just getting started with fine-tuning. Having used RAG with embeddings, which already gives not-bad results, I wanted to try the fine-tuning side. With some basic Python I was able to convert my text into JSONL, but the first results after training were below expectations: wrong answers, hallucinations galore… though at least hallucinations within the right context. The training loss of 1.8587 was also not converging as smoothly as expected. Setup: 67K tokens, 3 epochs, LR multiplier 2, about 40 long lines, each line in the following chat format (JSONL).

{"messages": [{"role": "system", "content": "…"}, {"role": "user", "content": "…"}, {"role": "assistant", "content": "…"}]}

The values in the content areas are:

  • system content = always same content… "You are an AI assistant for…
  • user content = chunk of text from original text file
  • assistant content = next chunk of text from original text file
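The chunk-pairing described above can be sketched roughly like this. This is a minimal sketch, assuming a naive fixed-size character splitter; the file names, system message, and chunk size are placeholders, not the actual values used.

```python
import json

# Hypothetical paths and values -- adjust to your own files (assumptions).
SOURCE = "instructions.txt"
OUT = "train.jsonl"
SYSTEM = "You are an AI assistant for ..."  # same system content on every line
CHUNK_CHARS = 1500  # rough chunk size; tune to your data


def chunks(text, size):
    """Split text into fixed-size character chunks (naive splitter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_jsonl(text, size=CHUNK_CHARS):
    """Pair each chunk (as 'user') with the following chunk (as 'assistant')."""
    parts = chunks(text, size)
    lines = []
    for user, assistant in zip(parts, parts[1:]):
        lines.append(json.dumps({
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user},
                {"role": "assistant", "content": assistant},
            ]
        }))
    return lines


if __name__ == "__main__":
    with open(SOURCE) as f:
        text = f.read()
    with open(OUT, "w") as f:
        f.write("\n".join(build_jsonl(text)))
```

Note that this only reproduces the "next chunk predicts the following chunk" scheme from the post; it does not generate real question–answer pairs.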

My challenge is to understand how to better structure the JSONL content to match the LLM's expectations and add knowledge from my original file (a file of instructions).

Questions:

  1. Do I really need to come up with a question (user content) for each snippet of actual text I want the LLM to learn (assistant content)? That seems a very tedious way to encode data, especially where some of the data in the text file is just CSV data. If so, any suggestions on writing good questions for what is in fact text that is part of a whole?
  2. Would setting my system content to an actual prompt (like the one I am relatively happy with already with RAG) be of value?
  3. Is this fine-tuning path likely to increase the accuracy of answers to my chat questions (which are already not bad with RAG/embeddings)?

Looking for additional guidance / suggestions / examples.

Welcome to the Forum!

The reason you are not seeing good results with fine-tuning is that it is not intended to inject knowledge into the model. Even when you provide question–answer pairs as part of your training data, the model will not systematically pick up this information during the fine-tuning process.

Therefore, your original approach of using RAG was the correct one.

You can read up further on strategies for optimizing the accuracy of model responses, and the roles that RAG, fine-tuning, and prompt engineering play in that regard, in this OpenAI guide.

Any further questions, let us know.


Thanks, this looks like a great document to know about.
