Inconsistent / Slow API Times This Past Week

We use gpt-4.1-mini, gpt-4o-mini, and gpt-4.1-nano in an application (currently in an A/B/C test). We have a 5 s timeout on the API call because of user-facing timing requirements, and our prompts are generally short and static except for a small part at the end based on user input. We also pass service_tier="priority" with our requests. In "normal" operation the average response time with this setup is 500-700 ms and we almost never get timeouts. Over the last 5-7 days we've been getting many intermittent timeouts, so API performance appears to have degraded. It's certainly not because the input length is too long or growing.
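
In case it helps anyone compare numbers, below is a rough sketch of how the latency and timeout rate could be logged per model. It is not our production code. `Sample`, `timed_call`, and `summarize` are hypothetical names, and `call` stands in for whatever wrapper you already have around the API request. I'm assuming the wrapper returns the service tier reported in the response (if the response includes one) and raises a timeout exception when the 5 s client timeout is hit.

```python
import statistics
import time
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple, Type


@dataclass
class Sample:
    model: str
    latency_s: float
    timed_out: bool
    served_tier: Optional[str] = None  # tier reported by the API, if any


def timed_call(
    model: str,
    call: Callable[[], Optional[str]],
    timeout_errors: Tuple[Type[BaseException], ...] = (TimeoutError,),
) -> Sample:
    """Run one request and record its wall-clock latency.

    `call` is a hypothetical wrapper around your own API request. Pass your
    SDK's timeout exception class in `timeout_errors` if it is not a
    subclass of the built-in TimeoutError.
    """
    start = time.perf_counter()
    try:
        tier = call()
        return Sample(model, time.perf_counter() - start, False, tier)
    except timeout_errors:
        return Sample(model, time.perf_counter() - start, True)


def summarize(samples: List[Sample]) -> dict:
    """Timeout rate plus mean/p50/p95 latency of the successful requests."""
    ok = sorted(s.latency_s for s in samples if not s.timed_out)

    def pct(p: float) -> Optional[float]:
        # Nearest-rank percentile over successful requests only.
        if not ok:
            return None
        return ok[min(len(ok) - 1, int(round(p * (len(ok) - 1))))]

    tiers = Counter(s.served_tier for s in samples if s.served_tier)
    return {
        "count": len(samples),
        "timeout_rate": (
            sum(s.timed_out for s in samples) / len(samples) if samples else 0.0
        ),
        "mean_s": statistics.mean(ok) if ok else None,
        "p50_s": pct(0.50),
        "p95_s": pct(0.95),
        "tiers": dict(tiers),
    }
```

Grouping the samples by model and by hour should show whether the timeouts cluster on one model or one time window. If the tier counts show requests being served at a tier other than priority, that would be worth including in a report.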

Is this a known issue, even for the priority service tier? I’m guessing you guys are bringing some new models online and that’s impacting capacity, but this is making the product unreliable for my use case.
