Hello OpenAI Community,
I’m building a programming tutor model using homework problems and solutions as training data. Despite experimenting with various train/validation splits (70/30, 80/20, 90/10) and testing on datasets of 100 and 240 data points, I’m running into overfitting: training loss keeps falling, but validation loss plateaus early and never improves past a certain point, which suggests the model isn’t generalizing.
I’ve also explored a wide range of hyperparameters, but none of them reduced the validation loss.
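For context, this is roughly how I’m building the splits (simplified sketch; the actual prompt/completion content is my homework data, and the field names here are just placeholders):

```python
import random

def split_dataset(examples, val_fraction=0.2, seed=42):
    """Shuffle and split a list of examples into (train, validation) sets."""
    rng = random.Random(seed)  # fixed seed so splits are reproducible
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))  # at least one val example
    return shuffled[n_val:], shuffled[:n_val]

# e.g. 240 problem/solution pairs (dummy placeholders standing in for real data)
data = [{"prompt": f"problem {i}", "completion": f"solution {i}"}
        for i in range(240)]
train, val = split_dataset(data, val_fraction=0.2)  # 192 train / 48 validation
```

So with the 240-example set and an 80/20 split, the model only ever sees 192 training examples, which I suspect is part of the problem.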
Could anyone suggest strategies or adjustments to reduce the overfitting? Any insights would be much appreciated.
Thanks for your help!