Improving GPT-4's Arabic writing in chat completions using training or embeddings

I've used GPT-4 to build a small tool that lets students practice conversation in Arabic. It combines Google's text-to-speech and speech recognition with GPT-4 chat completions. The challenge I'm facing is that GPT-4's responses don't use Arabic diacritics properly, which is understandable since the majority of Arabic text on the web doesn't have diacritics.

As a publishing house, we have a large number of Word documents containing the same text with and without diacritics. I'd like your help understanding the best way to use this data to improve GPT-4's use of diacritics in its answers.

Would embeddings help solve this problem, and if so, what's the best way to get started with them? If not, can a trained ada model be used for chat completions?
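
In case it helps frame an answer, here is a minimal sketch of how our paired text could be turned into training examples. It assumes the diacritized text has already been extracted from the Word documents into plain strings. The names `strip_diacritics` and `make_pairs` and the `prompt`/`completion` field names are hypothetical choices for illustration, not a format I know any particular API to require.

```python
import json

# Arabic short-vowel and related marks (fathatan through sukun, U+064B..U+0652)
# plus the superscript alef (U+0670). This set is an assumption about which
# marks count as "diacritics" for our purposes; it can be widened if needed.
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text: str) -> str:
    """Return the text with the Arabic diacritics listed above removed."""
    return "".join(ch for ch in text if ch not in DIACRITICS)


def make_pairs(diacritized_sentences):
    """Yield one training example per sentence: undiacritized in, diacritized out."""
    for sentence in diacritized_sentences:
        yield {
            "prompt": strip_diacritics(sentence),  # hypothetical field name
            "completion": sentence,                # hypothetical field name
        }


def to_jsonl(pairs) -> str:
    """Serialize examples as JSON Lines, keeping Arabic characters readable."""
    return "\n".join(json.dumps(pair, ensure_ascii=False) for pair in pairs)


if __name__ == "__main__":
    sample = ["\u0643\u064E\u062A\u064E\u0628\u064E"]  # the word "kataba" with fatha marks
    print(to_jsonl(make_pairs(sample)))
```

Since we already have both versions of each text, the stripping step would mainly serve as a consistency check that the two versions really differ only in their diacritics.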