OpenAI API takes too long to respond

Hello OpenAI team,

Why does the response time for the OpenAI API take so long compared to the Playground?

I use the chat completions API with the model gpt-3.5-turbo-1106.

On average it takes about 4 seconds to get a full response.
Is that normal?

The Playground uses a streaming response, which means tokens are sent to you as they are generated by the model, instead of waiting for the complete set of generated tokens before returning anything.
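You can get the same behavior from the API by setting `stream=True`. A minimal sketch using the official `openai` Python package (the model name is the one from your post; the actual network call is shown commented out since it requires a configured API key):

```python
# Request parameters for a streaming chat completion.
# `stream=True` makes tokens arrive as they are generated.
request = {
    "model": "gpt-3.5-turbo-1106",
    "messages": [{"role": "user", "content": "Say hello"}],
    "stream": True,
}

# With a configured client, the call would look like this:
#
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# for chunk in client.chat.completions.create(**request):
#     delta = chunk.choices[0].delta.content
#     if delta:
#         print(delta, end="", flush=True)
```

With streaming enabled, each chunk carries a small `delta` of the response, so you can start rendering text almost immediately instead of waiting for the full completion.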


Response time depends on many factors, including but not limited to the size of the generation and the time of day you run the request.

In the Playground, streaming gives the impression of the response coming back to you a token at a time, while a non-streaming API call returns the whole response in one chunk. Thus, the API might feel slower, but the total generation time should be about the same.
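This perceived-latency difference is easy to demonstrate without calling the API at all. The sketch below simulates a model emitting tokens at a fixed (assumed, made-up) rate and compares the wait until the first token with the wait for the full response:

```python
import time

def generate_tokens(n_tokens=20, per_token=0.01):
    """Simulate a model emitting tokens at a fixed rate.

    The rates here are illustrative, not real model timings.
    """
    for _ in range(n_tokens):
        time.sleep(per_token)
        yield "tok "

def time_to_first_token(stream):
    """How long a streaming consumer waits before seeing any text."""
    start = time.monotonic()
    next(stream)
    return time.monotonic() - start

def time_to_full_response(stream):
    """How long a non-streaming consumer waits for the whole response."""
    start = time.monotonic()
    "".join(stream)
    return time.monotonic() - start

ttft = time_to_first_token(generate_tokens())
total = time_to_full_response(generate_tokens())
# ttft is roughly one token's delay; total is roughly 20 tokens' worth.
# Generation time is identical either way - streaming just lets you
# start reading sooner.
```

The total time is the same in both cases; streaming only changes when the first bytes reach you, which is why the Playground feels faster than a blocking API call.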