Hi,
Just FYI.
The gpt-4-turbo-2024-04-09 model on Azure is consistently up to 2x faster than going direct to OpenAI (in terms of completion tokens/s). Most likely this is due to lower utilization on Azure for now.
To add to this: in my experience, latency when consuming OpenAI models via Azure can differ greatly depending on the region where the model is deployed.
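For anyone who wants to reproduce the comparison, here is a minimal sketch of how completion tokens/s can be measured. The `fake_completion` function is a hypothetical stand-in; in practice you would swap in your actual client call (OpenAI or Azure OpenAI chat completions) and return `response.usage.completion_tokens`.

```python
import time

def tokens_per_second(completion_fn, prompt):
    """Time a completion call and report completion tokens per second.

    completion_fn is a stand-in for whatever client call you use
    (e.g. OpenAI or Azure OpenAI chat completions); it should return
    the number of completion tokens generated for the prompt.
    """
    start = time.perf_counter()
    completion_tokens = completion_fn(prompt)
    elapsed = time.perf_counter() - start
    return completion_tokens / elapsed

# Hypothetical stub: pretends to generate 100 tokens in ~0.5 s,
# so the measured rate comes out at roughly 200 tokens/s or below.
def fake_completion(prompt):
    time.sleep(0.5)
    return 100

rate = tokens_per_second(fake_completion, "Hello")
print(f"{rate:.0f} tokens/s")
```

Running the same harness against both endpoints (same prompt, same `max_tokens`, several repetitions, averaged per region) makes the 2x claim easy to check for your own deployment.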