Is Anyone Getting Slow Response or Internal Server Error?

Use Chat Completions with “service_tier”: “priority” as an API parameter. That gets you GPT-5 without the Responses endpoint arbitrarily deciding to include or drop past reasoning items that were resent, which degrades the cache:
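A minimal sketch of the request shape, assuming the documented `service_tier` parameter on Chat Completions (the message content here is placeholder, not my actual prompt):

```python
# Request body: standard Chat Completions fields plus service_tier
# for priority processing; the whole conversation is resent each turn.
payload = {
    "model": "gpt-5",
    "service_tier": "priority",
    "messages": [
        {"role": "user", "content": "..."},  # placeholder conversation
    ],
}
print(payload["service_tier"])
```

Because you manage the message list yourself, nothing is silently added or dropped between turns.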

input tokens: 13147    output tokens:  6116
uncached:       987    non-reasoning:  1188
cached:       12160    reasoning:      4928

HTTP 200 (48011 ms)

That’s 127 tokens per second, currently at a weekend evening or past bedtime for much of the world. Still a long wait looking at nothing, but that comes from the reasoning about the task. Far faster than at the model’s 0-day release, indicating ‘efficiencies were found’.
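The throughput figure is just the usage above divided by the wall-clock time of the HTTP 200:

```python
# 6116 output tokens delivered over the 48011 ms request
output_tokens = 6116
elapsed_s = 48011 / 1000
tokens_per_second = output_tokens / elapsed_s
print(round(tokens_per_second))  # 127
```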

Otherwise I’m only making small gpt-5-mini calls to Responses, to keep on top of and classify the Responses endpoint’s recent failures. That very topic should degrade your trust in the endpoint until OpenAI says what’s going on, or what was failing with their state persistence.

For usage like this, with small server state, gpt-5-mini still leaves you staring at a blank screen longer than one would want before anything is seen - even when streaming. I don’t have benchmarks to report vs. typical Responses performance, other than typing minimal chats at it just now.

What would you like help with right now?
--Usage--  in/cached: 24/0;  out/reasoning:226/64

About six seconds to streaming:

--Usage--  in/cached: 201/0;  out/reasoning:388/192
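That “seconds to streaming” figure can be measured generically. A sketch of timing the first streamed chunk - it works on any iterable, so with a real call you would pass the SDK’s streaming response object instead of the toy iterator used here:

```python
import time

def time_to_first_token(stream):
    """Return (latency_seconds, first_chunk) for any iterable of chunks."""
    start = time.monotonic()
    first = next(iter(stream))
    return time.monotonic() - start, first

# Demo with a stand-in iterable; a real streaming response plugs in the same way.
latency, chunk = time_to_first_token(iter(["hello"]))
print(f"first chunk after {latency:.3f}s")
```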

They seem to have made sure that the new “Your Health” feature on the platform site doesn’t say anything bad, except for hard errors reported. So if it dips, that does mean there’s an issue of significance.

However, 500 errors are often provoked by bad inputs; I would try replaying the exact same ‘chat’, if it doesn’t rely on a ‘conversation’ that has changed, and classify the success/fail ratio when re-sending that API call body.
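A sketch of that replay-and-tally idea: re-send the identical request body some number of times, collect the status codes, and summarize. The actual sending is left out (it’s just your normal API call repeated); only the classification step is shown:

```python
import collections

def tally_replays(statuses):
    """Classify success/fail over replays of one identical request body."""
    counts = collections.Counter(statuses)
    ok = counts.get(200, 0)
    return ok, sum(counts.values()), dict(counts)

# e.g. status codes collected from ten replays of the same body:
ok, total, breakdown = tally_replays(
    [200, 200, 500, 200, 500, 200, 200, 200, 500, 200]
)
print(f"{ok}/{total} succeeded; breakdown: {breakdown}")
```

A mostly-failing tally points at the input or stored conversation state; intermittent failures point back at the endpoint.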