Since yesterday, my API latency has increased a lot; responses sometimes take 15–20 seconds.
I haven’t changed my code or model settings. Is this a known issue?
Would love to know if others are seeing the same thing or if there’s something I should adjust.
Which API? Which model? Your prompt? You have not provided enough information.
Some latency is expected; it depends on what you are doing, how busy the servers are, etc.
A 15-20 second response could be considered fast depending on the use case. Complex requests can take a lot longer.
Yes, very slow: both TTFT (time to first token) and tokens/sec.
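If it helps to put numbers on this, here is a rough sketch of how TTFT and throughput could be measured for any streamed response. Nothing in it is tied to a particular API: `measure_stream` and `fake_stream` are hypothetical names made up for illustration, and the sketch counts streamed chunks, which are not necessarily tokens. To use it for real, you would pass in whatever chunk iterator your client library returns.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Optional[float]]:
    """Time a stream of chunks (hypothetical helper, not part of any SDK).

    Returns time to first chunk, total time to the last chunk, the chunk
    count, and chunks per second measured after the first chunk arrived.
    """
    start = clock()
    first: Optional[float] = None
    last: Optional[float] = None
    count = 0
    for _ in chunks:
        now = clock()
        if first is None:
            first = now  # arrival of the first chunk, used for TTFT
        last = now
        count += 1

    if first is None or last is None:
        # The stream produced nothing, so there is nothing to time.
        return {"ttft_s": None, "total_s": None, "chunks": 0, "chunks_per_s": None}

    gen_time = last - first
    # Rate excludes the first chunk so the initial wait does not skew it.
    rate = (count - 1) / gen_time if count > 1 and gen_time > 0 else None
    return {
        "ttft_s": first - start,
        "total_s": last - start,
        "chunks": count,
        "chunks_per_s": rate,
    }


def fake_stream(n: int, first_delay: float, gap: float) -> Iterator[str]:
    """Stand-in for a real streaming response (assumption, for demo only)."""
    time.sleep(first_delay)
    for i in range(n):
        if i:
            time.sleep(gap)
        yield f"chunk{i}"


if __name__ == "__main__":
    # Replace fake_stream(...) with the chunk iterator from your own client.
    print(measure_stream(fake_stream(n=20, first_delay=0.5, gap=0.02)))
```

Logging these two numbers separately over a few runs would show whether the slowdown is in the wait before the first chunk or in the generation speed afterwards, which is useful detail to include when reporting the problem.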