If that 3% of your training data makes any difference at all, you can run a continuation fine-tune on top of your existing model, using a small training file whose examples pair the same inputs with corrected outputs. Run that job for twice the epochs of your original fine-tune (or more), and see whether it fixes that domain of answering.
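As a rough sketch of what that looks like in practice: the snippet below assembles the small corrective file in OpenAI's chat fine-tuning JSONL format. The example prompt/answer pair, the file name, and the model id in the comment are all placeholders, not values from your job.

```python
import json

def build_corrective_jsonl(pairs, path):
    """Write (prompt, corrected_answer) pairs as chat fine-tuning JSONL lines."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt, corrected in pairs:
            record = {"messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": corrected},
            ]}
            f.write(json.dumps(record) + "\n")

# Placeholder example: the same input your model gets wrong,
# paired with the output you actually want.
pairs = [
    ("What is the warranty period?", "The warranty period is 24 months."),
]
build_corrective_jsonl(pairs, "corrections.jsonl")

# Then start the continuation job on top of the existing fine-tune,
# raising n_epochs to roughly double the original run, e.g.:
#   client.fine_tuning.jobs.create(
#       training_file=uploaded_file_id,       # id returned by files.create()
#       model="ft:gpt-4o-mini:org::abc123",   # placeholder: your existing fine-tuned model
#       hyperparameters={"n_epochs": 6},      # ~2x the original epoch count
#   )
```

Passing your existing fine-tuned model name as `model` is what makes it a continuation rather than a fresh fine-tune from the base model.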