In the training data for fine-tuning GPT for my use case, where I want the bot to solve a law case (provided by the user) based on relevant laws (which will be provided within the prompt; I am getting them using RAG), do I need to provide the context (the relevant laws) within the input of each training example, or would just the user question (the law case) be enough?
And how many examples do I need to get it working properly?
And any tips on customising the fine-tuning parameters for my use case?
Can you take a step back and explain what specifically you are looking to achieve with your fine-tuning project?
To your specific questions: yes, you would need to provide the context for each example.
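Concretely, and this is just a minimal sketch with placeholder content rather than your actual data, one training example in the chat fine-tuning JSONL format could be assembled like this, with the retrieved laws placed in the user message exactly where your RAG pipeline will put them at inference time:

```python
import json

# Placeholder content for one training example (in practice, the same RAG
# output and case text you will see at inference time).
relevant_laws = "Section 12: ...\nSection 34: ..."
case_description = "The plaintiff claims that ..."
model_solution = "Step 1: Identify the applicable provisions ..."

# One line of the fine-tuning JSONL file, in the chat fine-tuning format:
# the context (laws) is included in the user message, just as it will be
# when the fine-tuned model is called with RAG-retrieved laws.
example = {
    "messages": [
        {"role": "system", "content": "You are a legal assistant that solves cases using only the provided laws."},
        {"role": "user", "content": f"Relevant laws:\n{relevant_laws}\n\nCase:\n{case_description}"},
        {"role": "assistant", "content": model_solution},
    ]
}

with open("train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")
```

The key point is that the prompt layout in training matches what your RAG pipeline produces at inference; otherwise the fine-tuned model sees inputs shaped differently from what it was trained on.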
How many examples really depends. In some cases 50-100 examples may be enough to get a decent fine-tuned model. In other cases, you may need to create 1,000+ examples for it to work properly. It’s common practice to start small and see where this takes you and then add more examples to further optimize it.
I want to fine-tune it so it can solve law cases in a specific format. I also assume that fine-tuning will help it understand how exactly to go about solving a case, i.e. it will probably pick up the thought process behind solving a law case from the fine-tuning.
Ok, got it. If the focus is on the how of solving a case, then a fine-tuned model should work.
I’d say the number of examples should then take the diversity of cases into consideration. If the cases and the approach to solving them are fairly similar, then you should be able to get by with a smaller set of examples (e.g. 30-50 examples to start with). In contrast, if you have a larger diversity of cases, then you’d need more training examples so that each type of approach is adequately represented in your data set and the model can pick up the pattern.
Understood, thanks. And for a larger diversity of cases, do I need to ensure that the number of training examples for each type of case is roughly similar, or would the model otherwise lean towards solving cases in the way for which we have more examples?
From my own experience, I’d try to keep the numbers somewhat similar. Like you wrote, otherwise you run the risk that it focuses too much on the most heavily represented solution approach.
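If it helps, a quick way to sanity-check that balance is to count examples per case type while you build the dataset. Here `case_type` is a hypothetical label you attach yourself during dataset construction (keep it in a separate file or strip it before uploading, since it is not part of the fine-tuning format):

```python
import json
from collections import Counter

# Quick sanity check on case-type balance in the training set.
counts = Counter()
with open("train.jsonl", encoding="utf-8") as f:
    for line in f:
        example = json.loads(line)
        counts[example.get("case_type", "unlabeled")] += 1

for case_type, n in counts.most_common():
    print(f"{case_type}: {n}")
```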
@zafarr - may I ask how you plan to do it? I mean, how are you going to build that model and train it with the cases? I would like to do the same for the company I work for. Thanks!
If your company can provide you with a decent-sized dataset of inputs and outputs, that’s the best-case scenario; you might not even need LLMs. As for LLMs, imo the real issue is the context problem that causes hallucination, and I don’t think fine-tuning by itself can fix that, so I haven’t figured out exactly how to make it work with LLMs. But if your questions are relatively straightforward and won’t ever need 15k-20k tokens of context, then LLMs would be fine: just use RAG to retrieve the relevant docs and you’re good to go. For getting output in the desired format, and for further improvement, you can fine-tune the LLM with 50-100 examples.
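To sketch what I mean (placeholders only: the retrieval function is a stub and the fine-tuned model id is made up), the flow with the OpenAI Python client would look roughly like this:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def retrieve_relevant_laws(case_text: str) -> str:
    # Stub for the RAG step: in practice, query your vector store here and
    # return the text of the matching laws.
    return "Section 12: ...\nSection 34: ..."

def solve_case(case_text: str) -> str:
    laws = retrieve_relevant_laws(case_text)
    # Same prompt layout as in the fine-tuning examples: retrieved laws + case.
    response = client.chat.completions.create(
        model="ft:gpt-3.5-turbo:your-org::your-job-id",  # placeholder model id
        messages=[
            {"role": "system", "content": "You are a legal assistant that solves cases using only the provided laws."},
            {"role": "user", "content": f"Relevant laws:\n{laws}\n\nCase:\n{case_text}"},
        ],
    )
    return response.choices[0].message.content
```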