Correcting wrong answers via fine-tuning

Fine-tuning is not the appropriate tool for this job.

Fine-tuning influences model behaviour, not model knowledge.

Presently, the only way to augment model knowledge is with some type of retrieval-augmented generation (RAG) implementation.
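To make the idea concrete, here is a rough sketch of what the retrieval step looks like. It is not a production setup: the documents, the `retrieve` and `build_prompt` helpers, and the bag-of-words scoring are all made up for illustration (a real system would normally use an embedding model and a vector store). The point is only that the correct fact gets placed in the prompt at question time instead of being trained into the weights.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; in practice this would be your own documents.
DOCS = [
    "The return window for the hypothetical Acme X200 is 45 days.",
    "Acme support is available on weekdays from 9am to 5pm.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Put the retrieved text into the prompt that is sent to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

if __name__ == "__main__":
    print(build_prompt("What is the return window for the X200?", DOCS))
```

When an answer is wrong, you fix it by correcting or adding a document, and no retraining is involved.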

Beyond that, 32 epochs of fine-tuning is a lot. I would be absolutely shocked if you haven’t overfit.
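If you want to check rather than guess, one common approach is to hold out a validation set and watch its loss per epoch. The sketch below assumes you already have those per-epoch validation losses from your own training run; the `early_stopping_epoch` helper, the `patience` value, and the numbers in the example are made up for illustration and are not from your training run.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss,
    stopping the scan once the loss has failed to improve for
    `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # validation loss has stopped improving
    return best_epoch

if __name__ == "__main__":
    # Made-up losses: validation loss falls, then rises as the model overfits.
    example_losses = [1.0, 0.8, 0.7, 0.75, 0.9, 1.1]
    print(early_stopping_epoch(example_losses))
```

If the validation loss bottoms out after a handful of epochs while the training loss keeps falling, everything after that point is most likely memorisation of the training set.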