GPT-4 Turbo consistently faster on Azure than direct from OpenAI

Hi,

Just FYI.

The gpt-4-turbo-2024-04-09 model on Azure is consistently up to 2x faster than the same model served directly by OpenAI (measured in completion tokens/s). Most likely this is due to lower utilization on Azure for now.
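In case anyone wants to reproduce the comparison, here is a minimal sketch of how I'd measure completion tokens/s with the OpenAI Python SDK. The Azure endpoint, deployment name, and API version below are placeholders (assumptions), not values from this thread:

```python
# Hedged sketch: compare completion tokens/s between Azure OpenAI and
# the OpenAI API directly. Requires valid API keys to actually run.
import time


def tokens_per_second(completion_tokens: int, elapsed_s: float) -> float:
    """Throughput metric used in the comparison above."""
    return completion_tokens / elapsed_s


def benchmark(client, model: str, prompt: str) -> float:
    """Time a single chat completion and return completion tokens/s."""
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    return tokens_per_second(resp.usage.completion_tokens, elapsed)


# Usage (placeholders; uncomment and fill in your own credentials):
# from openai import OpenAI, AzureOpenAI
# direct = benchmark(OpenAI(), "gpt-4-turbo-2024-04-09", "Write a haiku.")
# azure = benchmark(
#     AzureOpenAI(
#         api_version="2024-02-01",
#         azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
#     ),
#     "YOUR-DEPLOYMENT-NAME",  # Azure uses deployment names, not model IDs
#     "Write a haiku.",
# )
```

Note that a single request is noisy; averaging over several runs with identical prompts gives a more reliable tokens/s figure.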


Just to add to this: in my experience, latency when consuming OpenAI models via Azure can differ greatly depending on the region where the model is deployed.
