I see fine-tuning as superficial response tuning: essentially a set of examples showing how you expect the model to respond to a given input. It doesn't really add new knowledge to the model's context, so on its own it tends to produce hallucinations or inconsistent responses.
One approach to solving this is to combine RAG with a fine-tuned model, to reduce over/underfitting issues as much as possible.
You need to convert your knowledge base into embeddings, store those embeddings in a vector DB, then query the DB for entries similar to the user input.
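Here's a minimal sketch of that retrieval step, assuming sentence-transformers for the embeddings and a plain numpy cosine-similarity search standing in for a real vector DB (the chunk texts are made-up examples):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Your knowledge base, split into chunks (placeholder content).
docs = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday through Friday.",
]
doc_embeddings = model.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = model.encode([question], normalize_embeddings=True)[0]
    # With normalized vectors, the dot product equals cosine similarity.
    scores = doc_embeddings @ q
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]
```

A real setup would swap the numpy search for an actual vector DB, but the flow is the same: embed once at ingestion, embed the query at request time, rank by similarity.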
From there you can construct the prompt for the fine-tuned model:
"Based on the given context: {context}, answer the question: {question}"
You may want to customize that prompt according to your goal.
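Putting it together, here's a sketch of the generation step, assuming an OpenAI-style chat endpoint; the model ID is a placeholder for your own fine-tuned model:

```python
from openai import OpenAI

client = OpenAI()

def answer(question: str) -> str:
    # Fill the prompt template with the retrieved chunks.
    context = "\n".join(retrieve(question))
    prompt = f"Based on the given context: {context}\nAnswer the question: {question}"
    # "ft:gpt-4o-mini:your-org::abc123" is a placeholder fine-tuned model ID.
    resp = client.chat.completions.create(
        model="ft:gpt-4o-mini:your-org::abc123",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```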