Hi,
Just FYI.
The gpt-4-turbo-2024-04-09 model on Azure is consistently up to 2x faster than going direct to OpenAI (in terms of completion tokens/s). Most likely this is due to lower utilization on Azure for now.
To add to this: in my experience, latency when consuming OpenAI models via Azure can differ greatly depending on the region where the model is deployed.
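For anyone who wants to reproduce the comparison, here is a minimal sketch of how completion tokens/s can be measured. The `fake_completion` function is a hypothetical stand-in; in practice you would swap in your actual client call (OpenAI or Azure OpenAI chat completions) and return `response.usage.completion_tokens`.

```python
import time

def tokens_per_second(completion_fn, prompt):
    """Time a completion call and report completion tokens per second.

    completion_fn is a stand-in for whatever client call you use
    (e.g. OpenAI or Azure OpenAI chat completions); it should return
    the number of completion tokens generated for the prompt.
    """
    start = time.perf_counter()
    completion_tokens = completion_fn(prompt)
    elapsed = time.perf_counter() - start
    return completion_tokens / elapsed

# Hypothetical stub: pretends to generate 100 tokens in ~0.5 s,
# so the measured rate comes out at roughly 200 tokens/s or below.
def fake_completion(prompt):
    time.sleep(0.5)
    return 100

rate = tokens_per_second(fake_completion, "Hello")
print(f"{rate:.0f} tokens/s")
```

Running the same harness against both endpoints (same prompt, same `max_tokens`, several repetitions, averaged per region) makes the 2x claim easy to check for your own deployment.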