How is response time on Azure-hosted OpenAI models vs OpenAI’s API?

We use OpenAI models at Instafill.ai to fill out complex PDF and Word forms, and we’re trying to reduce response time for our users. We’re exploring whether Azure OpenAI might be faster in certain regions, and also considering sending requests to both Azure and OpenAI in parallel and using the first response.

If anyone has real-world latency comparisons or experience with this parallel setup, I’d appreciate your insights.

We’re a Tier 5 OpenAI partner.
