I have a prompt that extracts information such as the name, factory, etc., from a product sheet. While the prompt works well in most cases, for certain products the OpenAI model either fails to extract the data correctly or makes mistakes.
To address this, I am considering fine-tuning. Would it be a good approach to build my fine-tuning dataset in JSON by taking the incorrect responses generated by the model, manually correcting them, and then including them in the fine-tuning process?
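For concreteness, here is a minimal sketch of what I have in mind, assuming the chat fine-tuning format (one JSON object with a `messages` list per JSONL line); the file name, field names, and example values are placeholders:

```python
import json

# Hypothetical: each record pairs the original product-sheet prompt with the
# manually corrected extraction for a case where the model got it wrong.
corrected_examples = [
    {
        "prompt": "Extract name, factory, etc. from:\n<product sheet text>",
        "corrected_output": {"name": "Widget A", "factory": "Plant 3"},
    },
    # ... more manually reviewed failure cases
]

# Write one JSON object per line, in the chat format the fine-tuning API expects.
with open("finetune_dataset.jsonl", "w", encoding="utf-8") as f:
    for ex in corrected_examples:
        record = {
            "messages": [
                {"role": "system", "content": "You extract product data as JSON."},
                {"role": "user", "content": ex["prompt"]},
                {"role": "assistant", "content": json.dumps(ex["corrected_output"])},
            ]
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```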
Additionally, what best practices can be applied to minimize hallucinations in the API’s responses?
Fine-tuning might help, but first I'd check how these hallucinations actually manifest, and whether it might just be a prompt and/or model-selection issue.
Further, I’d check whether you’re actually getting hallucinations (model making stuff up) or just plain old mistakes where the model confuses things that are in the document.
I’d look at it like this: If you were to print out your prompt and snail-mail it to your grandmother without further instructions, would she be able to complete the task in a way that would satisfy you? If the answer is no, then I’d keep working on the prompt.
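As a sketch of what "keep working on the prompt" can look like in practice, here's one way to tighten the extraction using the official `openai` Python SDK (v1): pin the temperature, force JSON mode, and tell the model explicitly to return null rather than guess. The model name and field names are just placeholders, not a recommendation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM = (
    "You extract fields from a product sheet. Return JSON with keys "
    '"name" and "factory". If a field is not present in the sheet, '
    "use null. Never guess or invent values."
)

def extract(sheet_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; pick whatever model you're evaluating
        temperature=0,  # reduce variance for a deterministic extraction task
        response_format={"type": "json_object"},  # JSON mode: guarantees parseable JSON
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": sheet_text},
        ],
    )
    return resp.choices[0].message.content
```

In my experience, the explicit "use null, never guess" instruction plus temperature 0 tends to cut down on invented values, and it also makes the remaining failures easier to collect and correct if you do end up fine-tuning later.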
This might just be me, and there are other opinions out there, but I haven't yet encountered a functional problem that was better solved by fine-tuning than by refining the prompt and/or workflow.