Is high latency normal for responses to prompts in languages other than English?

hotinno · October 26, 2023, 6:11am

Hi everyone,

I’ve been prompting the API in Chinese for a while (about 5 seconds per response on average with GPT-3.5), and I’ve realised the latency is really high compared to English (2-3 seconds on average).

My issue is that I want the response to be in Chinese. How do you guys deal with this? I saw there is OpenAI on Azure, but it is only available to a select number of enterprise clients.

b0zal · October 26, 2023, 8:57am

Yes, it’s normal. I sometimes use Indonesian.

_j · October 26, 2023, 9:07am

“Latency” is the wrong term. OpenAI also misuses it when they mean “token generation rate”. Latency would more properly be how long it takes you to get the first token of a stream, including network time and the loading of context.
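To make the distinction concrete, here is a rough sketch that separates the two numbers for any stream of chunks. The function name `measure_stream` and the injectable `clock` parameter are my own for illustration, not part of any OpenAI library, and it assumes each streamed chunk is roughly one token, which is not guaranteed.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Return (time_to_first_chunk, chunks_per_second, chunk_count).

    Assumption: one chunk is roughly one token. Treat the rate as an estimate.
    """
    start = clock()
    first = None
    last = start
    count = 0
    for _ in chunks:
        now = clock()
        if first is None:
            first = now  # the "latency" part: waiting for the first chunk
        last = now
        count += 1
    if first is None:
        return None, 0.0, 0  # the stream produced nothing
    generation_time = last - first
    # Rate is measured only after the first chunk has arrived.
    rate = (count - 1) / generation_time if generation_time > 0 else 0.0
    return first - start, rate, count


# Demo with a fake clock so the numbers are reproducible (hypothetical timings).
ticks = iter([0.0, 1.0, 1.5, 2.0, 2.5])
ttft, rate, count = measure_stream(["a", "b", "c", "d"], clock=lambda: next(ticks))
print(f"time to first chunk: {ttft:.2f}s, rate: {rate:.2f} chunks/s, chunks: {count}")
```

With a real streaming response you would pass the stream iterator and leave `clock` at its default. A long wait before the first chunk is latency; a slow trickle after that is generation rate.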

What you are really talking about here is the perceived character and language production rate: how many lines of text come out of the AI per minute.

Chinese has a very high token consumption per character. Unlike English, where a 10-character word can be a single token, a single Chinese glyph can require two tokens.

Advantage: Chinese carries much more meaning per character, though (you can turn 4000 characters of English into 1500 characters of Chinese to fit into the “custom instruction” box).

This means that even though the AI has the same token production rate, the streaming text appears slower for Chinese-based languages. The same goes for total response time: a reply that needs more tokens takes longer to finish.
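A back-of-the-envelope version of that arithmetic, as a sketch. All three numbers are illustrative assumptions, not measurements: a generation rate of 40 tokens per second, about 4 English characters per token (a common rule of thumb), and the worst case mentioned above of two tokens per Chinese glyph.

```python
def chars_per_second(tokens_per_second: float, tokens_per_char: float) -> float:
    """Perceived character rate for a given token rate and tokenization cost."""
    if tokens_per_char <= 0:
        raise ValueError("tokens_per_char must be positive")
    return tokens_per_second / tokens_per_char


# Illustrative assumptions only; real values depend on the model and the text.
TOKEN_RATE = 40.0               # tokens per second, same for both languages
ENGLISH_TOKENS_PER_CHAR = 0.25  # roughly 4 characters per token
CHINESE_TOKENS_PER_CHAR = 2.0   # worst case: two tokens per glyph

english = chars_per_second(TOKEN_RATE, ENGLISH_TOKENS_PER_CHAR)
chinese = chars_per_second(TOKEN_RATE, CHINESE_TOKENS_PER_CHAR)
print(f"English: {english:.0f} chars/s, Chinese: {chinese:.0f} chars/s")
```

Under these assumptions the English text appears at 160 characters per second and the Chinese at 20, even though the token rate is identical. For real counts you would run your own prompts through a tokenizer such as tiktoken instead of guessing the ratios.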
