Welcome to the Forum!
The reason you are not seeing good results with fine-tuning is that it is not intended to inject knowledge into the model. Even if you provide question-answer pairs as part of your training data, the model will not pick up this information systematically during the fine-tuning process.
Therefore, your original approach of using RAG was the correct one.
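To illustrate why RAG fits this use case, here is a minimal sketch of the idea: the relevant facts are looked up at question time and placed in the prompt, rather than being trained into the model. Everything in it is an assumption for illustration only: the sample documents, the helper names (`retrieve`, `build_prompt`), and the simple word-overlap scoring, which stands in for the embedding-based search a real setup would normally use. It stops at building the prompt and does not call any model API.

```python
import re
from collections import Counter

# Hypothetical knowledge base; in practice these would be chunks of your own documents.
DOCS = [
    "The warranty period for the X100 is 24 months.",
    "Support is available on weekdays from 9 to 17.",
    "The X100 ships with a USB-C cable.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, passage):
    """Count overlapping tokens (a toy stand-in for embedding similarity)."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)

def retrieve(query, passages, k=2):
    """Return the k passages that share the most tokens with the query."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """Put the retrieved passages into the prompt as context for the model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

if __name__ == "__main__":
    print(build_prompt("How long is the warranty for the X100?", DOCS, k=1))
```

The resulting prompt is what you would send to the model, so the answer comes from the retrieved text and not from anything the model had to memorize during training.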
You can read more about strategies for optimizing the accuracy of model responses, and the roles that RAG, fine-tuning, and prompt engineering play in that, in this OpenAI guide.
If you have any further questions, let us know.