I want to fine-tune an LLM for a very specific use case (similar to Q&A) in a domain that current LLMs (GPT-3, GPT-4, etc.) don't know much about. I have data that represents general knowledge of the domain I'm working in, and data that defines the specific input->output format, with examples, for my use case.
When it comes to fine-tuning GPT-3.5, what do you think makes more sense? First train on general domain knowledge for some time and afterwards train on the specific use case with the more limited data set, or train everything in a single run, where the data set would be more diverse: books, articles, etc. containing general domain knowledge, plus my data set representing the specific use case with its unique structure.
Thank you very much
Honestly, I don't think there is an established school of thought on which approach works better for this kind of use case.
Since you have a specific input/output format and examples, it would make sense to train first on the general domain knowledge and then on the specifics, so that the model ends up with a solid grasp of the formatting.
Although for GPT, I do not think you can fine-tune a model that has already been fine-tuned once before, so you might have to do one big single run.
Yeah, currently you cannot fine-tune an already fine-tuned model. But you could fake it: instead of training on two different datasets for multiple epochs each, unify them into one dataset in which the first dataset is repeated 10 times, followed by the second dataset repeated 10 times, and then train for only one epoch. I haven't tried that myself yet, though, so it's more of an idea.
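The interleaving idea above could be sketched roughly like this: build one combined JSONL training file where each dataset is repeated, so a single epoch over the file approximates sequential multi-epoch training. The toy messages, the repeat count of 10, and the output filename are all placeholders, not anything prescribed by the OpenAI API.

```python
import json

# Toy stand-ins for the two datasets; in practice you would load your own
# JSONL files in the chat fine-tuning format ({"messages": [...]} per line).
domain = [
    {"messages": [{"role": "user", "content": "What is X?"},
                  {"role": "assistant", "content": "X is ..."}]},
]
task = [
    {"messages": [{"role": "user", "content": "INPUT: foo"},
                  {"role": "assistant", "content": "OUTPUT: bar"}]},
]

REPEATS = 10  # stands in for per-dataset "epochs" within one training run

# Repeat the domain data first, then the task data, so a single pass over
# the combined file mimics training on each dataset in sequence.
combined = domain * REPEATS + task * REPEATS

with open("combined_train.jsonl", "w") as f:
    for example in combined:
        f.write(json.dumps(example) + "\n")
```

You would then upload `combined_train.jsonl` as the training file and set the number of epochs to 1 in the fine-tuning job, so each repeated copy is seen exactly once.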