For the past day, we’ve been getting incorrect responses from the Responses API. The model keeps returning a completed message whose text is empty:
{
  "id": "<id>",
  "type": "message",
  "status": "completed",
  "content": [
    {
      "type": "output_text",
      "annotations": [],
      "logprobs": [],
      "text": ""
    }
  ],
  "role": "assistant"
},
The specified JSON schema is completely ignored. The response also indicates no errors, but it seems we’re getting billed for the 100k+ tokens it ends up using :)))
Anyone else experiencing this issue? I’m using a fine-tuned 4o model. Not every request hits this death spiral, though; some go through correctly.
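A hedged workaround while this is happening: inspect each returned message for the empty-`output_text` pattern before trusting it, and retry or fail fast instead of accepting the bad output. A minimal sketch — the `is_empty_output` helper is my own, not an SDK feature, and it just pattern-matches the malformed payload shown above:

```python
def is_empty_output(message: dict) -> bool:
    """Return True if an assistant message contains only empty output_text parts.

    Mirrors the fault shown above: status "completed", but every
    content part has text == "".
    """
    if message.get("type") != "message":
        return False
    parts = message.get("content", [])
    return bool(parts) and all(
        p.get("type") == "output_text" and p.get("text", "") == ""
        for p in parts
    )


# The payload from the post trips the check:
bad = {
    "id": "<id>",
    "type": "message",
    "status": "completed",
    "content": [
        {"type": "output_text", "annotations": [], "logprobs": [], "text": ""}
    ],
    "role": "assistant",
}
assert is_empty_output(bad)  # detected -> retry or raise instead of paying again
```

Caught early, at least you can stop re-sending the same prompt into the loop.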
Yep, situation abnormal; the same long-persisting symptoms that went on for weeks a few weeks ago.
At top_p: 0:
- no response for minutes
- a response containing vision-based text
- a constant loop of “assistant”
- completely random, inconsistent responses when the model does respond; top_p: 0 does nothing to normalize the output. I have to go with temperature and top_p: 0.01 instead, and still get the same assistant-message spam until max_output_tokens.
The first thing I set, knowing that it would loop, was the maximum output tokens. Useless.
This fault will incur a huge bill that max_output_tokens cannot stop. Here, even setting it to just above the minimum accepted value still reports the same 35k input usage, which would otherwise have been 301 tokens for the small image:
Varying max_output_tokens doesn’t vary the bill by the same amount, and neither does terminating a streaming output. I imagine that if you sent a message with many non-tiny images, the potential cost would not be limited at all by the model context, as you show. It can probably also re-bill for the 200+-token “deny images” system message that is supposed to be exempt, which OpenAI attempts to keep secret.
This massively violates the API promise that max_output_tokens acts as a billing limit.
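You can at least detect the violation on your side by comparing the reported usage against the cap you requested and alarming when it is exceeded. A small sketch, assuming the Responses API `usage` object carries `input_tokens`/`output_tokens` counts; the threshold logic itself is mine:

```python
def exceeds_output_cap(usage: dict, max_output_tokens: int) -> bool:
    """True when billed output tokens blow past the requested cap."""
    return usage.get("output_tokens", 0) > max_output_tokens


# Numbers in the spirit of the thread: a 16-token cap, 100k+ tokens billed.
usage = {"input_tokens": 35_000, "output_tokens": 100_000}
if exceeds_output_cap(usage, max_output_tokens=16):
    print("billing cap violated:", usage["output_tokens"], "output tokens billed")
```

It won’t stop the charge, but it gives you a log trail to attach to a support ticket.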
To stop any support time-wasting and render “please provide a request ID” useless as a delaying tactic:
- Empty response and nonstop output: x-request-id req_f48bdc55509442b6a033864f1d7aea8b, req_54a906c37e774230a84105c9b04f0caf
- Empty response that terminates on one: x-request-id req_e2422e2c709e496582b656864bd0ddf1
The playground seems to demonstrate only a single empty assistant output, even though this is currently happening about 50% of the time, even at max_output_tokens=16 with the tiniest of images (282t input):
Chat Completions:
1 in 10 requests or so hit exactly the same fault there too, instead of going loopy:
Another thing that happened: base 4o (gpt-4o-2024-08-06) spat my JSON schema definition back at me and then continued emitting empty messages and newlines until it reached the output token limit of 5k.
Hey everyone, we believe this issue should be resolved now, as our engineering team has deployed a fix. Please let us know if you are still facing the issue. Thank you!