What values should I use for epochs, batch_size, and learning rate when training a model on question-and-answer pairs? And how many samples are sufficient?
What API are you using?
I am using the fine-tuning mechanism described in the documentation.
Update:
I am using Davinci 003 for fine-tuning.
I think he meant which model, i.e., Davinci, etc.
As far as I know, epoch, batch_size and learning rate should be automatically set for best performance. I believe OpenAI recommends at least 200 examples, though the more the better. If you fine-tune with 1000+ examples and still aren’t getting good results, that might be the time to tinker with batch_size, epochs, etc.
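Before tinkering with hyperparameters, it's worth making sure the training data itself is formatted correctly, since that matters at least as much. A minimal sketch of writing question-answer pairs into the prompt/completion JSONL format the fine-tuning endpoint expects (the `###` separator and `END` stop token here are conventions from OpenAI's data-preparation guidelines, not hard requirements; adjust them for your setup):

```python
import json

def to_finetune_jsonl(pairs, path):
    """Write (question, answer) pairs as prompt/completion JSONL
    for fine-tuning. Separator and stop-token choices are conventions,
    not requirements."""
    with open(path, "w") as f:
        for question, answer in pairs:
            record = {
                # A fixed separator marks where the prompt ends
                "prompt": f"{question}\n\n###\n\n",
                # Leading space aids tokenization; "END" serves as a stop sequence
                "completion": f" {answer} END",
            }
            f.write(json.dumps(record) + "\n")

pairs = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote Hamlet?", "William Shakespeare"),
]
to_finetune_jsonl(pairs, "train.jsonl")
```

Using a consistent separator and stop sequence across all examples lets the fine-tuned model learn where prompts end and completions stop, which tends to matter more for Q&A quality than small changes to epochs or batch size.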
From my GPT-2 experience, epochs matter because too much fine-tuning can result in overfitting, which means the language model repeats verbatim from the dataset rather than generating new content.
Hope this helps!