Token count for right-to-left languages

I was running some experiments with davinci3 on non-English languages, especially right-to-left languages like Arabic, and I noticed that the token count is much higher than for other languages like Turkish or German.
I already know this model uses a BPE tokenizer.
Is there any workaround for this, especially for lowering the cost of completion and fine-tuning?
Note that I think translation is not an option, since we would lose part of the context!
Thanks in advance
:slightly_smiling_face:
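
For anyone wondering where the gap comes from, here is a rough sketch that may help. It assumes the tokenizer is a byte-level BPE (as in the GPT-2 family), so text starts out as UTF-8 bytes before any merges are applied. The helper name `utf8_bytes_per_char` and the sample strings are made up for illustration. This does not compute real token counts; it only shows the starting point the tokenizer works from.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per non-whitespace character."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(len(c.encode("utf-8")) for c in chars) / len(chars)


# Hypothetical sample phrases, roughly "hello world" in each language.
samples = {
    "English": "hello world",
    "Turkish": "merhaba dünya",
    "Arabic": "مرحبا بالعالم",
}

for name, phrase in samples.items():
    total_bytes = len(phrase.encode("utf-8"))
    print(f"{name}: {total_bytes} bytes, "
          f"{utf8_bytes_per_char(phrase):.2f} bytes per character")
```

Arabic letters take two bytes each in UTF-8, while most Latin letters take one, so the Arabic text starts with about twice as many units. If the merge table was learned mostly from English text, fewer of those bytes are likely to be merged into longer tokens, which would explain the higher count. For exact numbers you would still need to run the model's own tokenizer on your text.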


Hi Saied,
Did you find any solution for this issue?
I am currently facing the same problem with the API and would like to bring the token count down to match English.

Thanks