Designing a RAG system for machine translation

Hi,
I am working on a machine translation project. The dataset consists of cartoon dialogue, and I have text in multiple language pairs. I tried GPT-4o, but the translation quality was not good enough. I then fine-tuned GPT-4o on one language pair for 4 epochs on 3k examples, but the results were worse than the base GPT-4o. The main problem is that the model does not capture the specific tone of the dialogue.

I cannot fine-tune for more epochs due to budget constraints. How could I use RAG in this context? Should I embed the entire dataset? Any suggestions or papers would be appreciated.
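
To make the question concrete, here is a rough sketch of what I imagine RAG could look like here: treat the existing translation pairs as a memory, retrieve the pairs whose source lines are most similar to the new line, and put them into the prompt as few-shot examples so the model can copy the tone. This is only an assumption about how it might work, not something I have tested. The bag-of-words similarity is a stand-in for a real embedding model, the English/French lines are made-up toy data, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words counts; a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, memory, k=2):
    """Return the k (source, target) pairs whose source is most similar to query."""
    q = bow(query)
    ranked = sorted(memory, key=lambda pair: cosine(q, bow(pair[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, examples, src_lang="English", tgt_lang="French"):
    """Assemble a few-shot translation prompt from the retrieved pairs."""
    parts = [
        f"Translate cartoon dialogue from {src_lang} to {tgt_lang}, "
        "matching the tone of the examples."
    ]
    for src, tgt in examples:
        parts.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    parts.append(f"{src_lang}: {query}\n{tgt_lang}:")
    return "\n\n".join(parts)


# Toy translation memory (made-up data, for illustration only).
memory = [
    ("Watch out, he's right behind you!", "Attention, il est juste derrière toi !"),
    ("I'm so hungry I could eat a horse.", "J'ai une faim de loup."),
    ("Let's get out of here!", "Filons d'ici !"),
]

query = "Watch out, the dragon is behind you!"
examples = retrieve(query, memory, k=2)
prompt = build_prompt(query, examples)
print(prompt)  # this string would be sent to the model as the user message
```

In this version only the source side of each pair is compared, and the retrieved pairs are pasted in verbatim. Is that a reasonable starting point, or is there a better way to select examples for tone?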