Best GPT response time for real-time applications

christophe.gouet · September 22, 2023, 12:50pm

Hi,

My team measured Davinci and Curie response times over the course of a day to understand the occasional long response times we get in our application (tested with a 1-token prompt).
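A measurement of this kind could be sketched roughly as follows. This is only a minimal sketch, not the harness the team actually used: `send_request` is a hypothetical stand-in for whatever function issues the real API call, and `budget_s` is an assumed parameter set to the 2-second target mentioned in this thread.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(send_request: Callable[[], object], runs: int = 20) -> List[float]:
    """Call send_request `runs` times and return wall-clock latencies in seconds.

    `send_request` is a hypothetical placeholder for the real API call.
    """
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float], budget_s: float = 2.0) -> Dict[str, float]:
    """Summarize latencies and report the share of calls slower than budget_s."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies)
    # Nearest-rank 95th percentile: the smallest sample that is >= 95% of all samples.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
        "max_s": ordered[-1],
        "over_budget_fraction": sum(x > budget_s for x in ordered) / len(ordered),
    }
```

Running `summarize(measure_latencies(send_request))` at regular intervals through the day and logging the result with a timestamp would show whether the slow calls cluster at particular hours, which is what one would expect if they come from queuing under load.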

We need a response time under 2 seconds, so here are my questions:

Are these peaks due to queuing?
Is it possible to buy something like “exclusive endpoints” to get better response times and stability?

Thanks!

Foxalabs · September 22, 2023, 1:35pm

Hi and welcome to the Developer Forum!

You can indeed get a dedicated instance. They start to make commercial sense if you are using ~450M tokens per day. You can reach out to get more information here
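To put the ~450M tokens per day figure in perspective, it can be converted into a sustained rate. The sketch below assumes usage is spread evenly over 24 hours, which real traffic rarely is; `sustained_rate` is a hypothetical helper written only for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
MINUTES_PER_DAY = 24 * 60       # 1,440


def sustained_rate(tokens_per_day: float) -> dict:
    """Convert a daily token volume into average per-minute and per-second rates."""
    return {
        "tokens_per_minute": tokens_per_day / MINUTES_PER_DAY,
        "tokens_per_second": tokens_per_day / SECONDS_PER_DAY,
    }


rate = sustained_rate(450_000_000)
print(f"{rate['tokens_per_minute']:,.0f} tokens/min")
print(f"{rate['tokens_per_second']:,.0f} tokens/s")
```

Under that even-usage assumption, ~450M tokens per day works out to about 312,500 tokens per minute, or roughly 5,200 tokens per second around the clock.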
