Overfitting issues when fine-tuning GPT-3.5 Turbo

Hello OpenAI Community,

I’m building a programming tutor model using homework problems and solutions as training data. Despite experimenting with various training/validation splits (70/30, 80/20, 90/10) and testing on datasets of 100 and 240 data points, I’m running into overfitting: the validation loss plateaus at a certain point and stops improving, indicating poor generalization.

I’ve also explored a wide array of hyperparameters without success in reducing validation loss.
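For context, a minimal sketch of launching a GPT-3.5 Turbo fine-tuning job with explicit hyperparameters via the OpenAI Python SDK is shown below. The file IDs and hyperparameter values are placeholders I've made up for illustration, not the poster's actual settings; lowering `n_epochs` and `learning_rate_multiplier` are common first levers against overfitting on small datasets.

```python
# Sketch: assembling a fine-tuning job request with explicit
# hyperparameters. File IDs and values are placeholders.
# from openai import OpenAI  # uncomment when running for real

def build_job_params(training_file, validation_file, n_epochs=3):
    """Assemble keyword arguments for client.fine_tuning.jobs.create."""
    return {
        "model": "gpt-3.5-turbo",
        "training_file": training_file,
        "validation_file": validation_file,
        # Fewer epochs and a smaller learning-rate multiplier reduce
        # how hard the model fits a small training set.
        "hyperparameters": {
            "n_epochs": n_epochs,
            "learning_rate_multiplier": 0.5,
        },
    }

params = build_job_params("file-train-placeholder", "file-valid-placeholder", n_epochs=2)
# client = OpenAI()
# job = client.fine_tuning.jobs.create(**params)
```

Passing a `validation_file` is what produces the validation-loss curve in the fine-tuning dashboard, so you can see whether a hyperparameter change actually helps generalization.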

Could anyone suggest strategies or adjustments to better address overfitting? Any insights would be highly appreciated.

Thanks for your help!

Can you let me know which type of training you are doing? Are you taking samples and making batches, or are you training in a continuous manner — i.e., you have a huge text corpus, you slide over it taking batch_size × max_len tokens at a time, and reshape them?
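For reference, the "continuous" scheme described here can be sketched as follows: one long token stream is consumed in fixed-size chunks, each reshaped into a (batch_size, max_len) block. The function name and shapes are illustrative, not from any particular library.

```python
# Sketch of continuous (sliding) batching over one long token stream:
# take batch_size * max_len tokens per step and reshape them.
import numpy as np

def continuous_batches(token_ids, batch_size, max_len):
    """Yield (batch_size, max_len) blocks from a single token stream."""
    tokens_per_batch = batch_size * max_len
    # Drop the tail that doesn't fill a whole batch.
    n_batches = len(token_ids) // tokens_per_batch
    for i in range(n_batches):
        chunk = token_ids[i * tokens_per_batch:(i + 1) * tokens_per_batch]
        yield np.asarray(chunk).reshape(batch_size, max_len)

# Toy corpus of 100 token ids with batch_size=4, max_len=10:
# two full 4x10 batches, 20 trailing tokens discarded.
batches = list(continuous_batches(list(range(100)), batch_size=4, max_len=10))
```

This contrasts with sampling independent (prompt, completion) examples and padding them into batches, which is what the hosted fine-tuning API does with JSONL training data.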