Improving Arabic writing of gpt-4 in chat completions using training or embeddings

3asafeer · May 7, 2023, 3:49pm

I’ve used gpt-4 to build a small tool for students to practice conversation in Arabic combining google’s text to speech and speech recognition with gpt-4 chat completions, the challenge I’m facing that gpt-4 responses are not using diacritics properly in Arabic (which is understandable since the majority of the text on the web doesn’t have diacritics), being a publishing house we have large amounts of data of word documents containing the same text with and without diacritics, and I wanted your help to understand the best way to improve gpt-4 models answers using diacritics would embeddings help solve this problem and whats the best way to get started with it. and if not can a trained ada model be used for chat completions?

alex.petrescu · July 9, 2023, 1:37am

I have a very similar problem. Did you ever get any helpful responses or find ways to improve the responses with regards to diacritics?

abimbolaolawale41 · November 22, 2023, 9:03am

I am having the same issue with my work.

Please kindly let us know if you have been able to resolve this issue.

However, if I find a solution, I will post it here.

_j · November 22, 2023, 2:13pm

One technique you can use with AI models is by giving multi-shot examples of conversation style before the actual user input.

You may be able to shape the proper output by putting five or ten similar writing examples where typical user inputs are responded by the AI assistant with the exact text encoding and formatting that it should produce.

Topic		Replies	Views
Pseudo fine-tuning chat completions... best practices? Prompting gpt-4	4	1027	December 24, 2023
Fine-tune for a specific language spell check API	10	1802	March 2, 2023
Struggling with poor performance on fine-tuned davinci model API	15	2699	December 20, 2023
Fint-tuned model long responses and very slow responses API	10	1881	December 24, 2023
Can I combine Embeddings with Finetuning to develop a bot? API gpt-4 , chatgpt	3	943	December 24, 2023

Improving Arabic writing of gpt-4 in chat completions using training or embeddings

Related topics