API service outage: Fine-tuned gpt-4o with image input fails randomly on Responses & Chat Completions

For the past day, we’ve been getting incorrect responses from the Responses API. It keeps returning messages whose output text is empty:

{
  "id": "<id>",
  "type": "message",
  "status": "completed",
  "content": [
    {
      "type": "output_text",
      "annotations": [],
      "logprobs": [],
      "text": ""
    }
  ],
  "role": "assistant"
}

All the while it completely ignores the specified JSON schema. The response also reports no errors, but it seems we’re still getting billed for the 100k+ tokens it ends up using :)))

Anyone else experiencing this issue? I’m using a fine-tuned 4o model. Not every request has this death spiral though; some do go through correctly.

Edit: I am using image input
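Since only some requests hit the death spiral, I’ve been filtering for it client-side. A minimal sketch (the helper name `has_empty_output` is mine, not from the SDK) that flags the empty-output failure mode in a parsed response message so the request can be retried instead of trusted:

```python
import json

def has_empty_output(message: dict) -> bool:
    """True if an assistant message contains only empty output_text parts."""
    if message.get("role") != "assistant":
        return False
    parts = message.get("content", [])
    texts = [p.get("text", "") for p in parts if p.get("type") == "output_text"]
    return bool(texts) and all(t.strip() == "" for t in texts)

# The failing payload from above, verbatim.
bad = json.loads("""
{
  "id": "<id>",
  "type": "message",
  "status": "completed",
  "content": [
    {"type": "output_text", "annotations": [], "logprobs": [], "text": ""}
  ],
  "role": "assistant"
}
""")
print(has_empty_output(bad))  # True
```

This only detects the fault after the fact, of course; the tokens are already billed by then.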


Yep, situation abnormal here too; the same long-persisting problem that happened weeks ago, for weeks.

At top_p: 0 I get:

  • no response for minutes
  • a response that has vision-based text
  • a constant loop of “assistant”

Also: completely random, inconsistent responses when the model does respond. top_p: 0 does nothing to normalize the output; I have to go with temperature and top_p at 0.01, only to get the same amount of assistant-message spam until max_output_tokens.
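When streaming, the “assistant” loop can at least be cut short on the client side. A sketch (helper name and cutoff are mine; the stream here is faked with plain strings, not real SDK events):

```python
import itertools

def abort_on_spam(chunks, max_empty=20):
    """Yield streamed text deltas, stopping once max_empty consecutive
    empty/whitespace chunks arrive (the 'assistant spam' loop)."""
    empty_run = 0
    for text in chunks:
        if not text.strip():
            empty_run += 1
            if empty_run >= max_empty:
                return  # bail out instead of streaming to the token cap
        else:
            empty_run = 0
        yield text

# Faked stream: real text, then an endless run of empties.
stream = itertools.chain(["Hello", " world"], itertools.repeat("", 1000))
out = "".join(abort_on_spam(stream, max_empty=5))
print(out)  # Hello world
```

This is damage control only; whether aborting the stream actually reduces what gets billed is another question.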


Oof. The entire thing also gets counted as input tokens, so I’m assuming there’s no equivalent param to max_output_tokens that I can use to stop it.
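Lacking a server-side cap, the best I can do is check the reported usage after each call and stop sending further requests once a budget is blown. A sketch (budget number and helper name are mine, assuming the usual `input_tokens`/`output_tokens` usage fields):

```python
class TokenBudgetExceeded(RuntimeError):
    pass

def check_usage(usage: dict, max_total: int = 10_000) -> int:
    """Raise if a single call billed more tokens than the budget allows."""
    total = usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    if total > max_total:
        raise TokenBudgetExceeded(f"{total} tokens billed, budget {max_total}")
    return total

# A normal small-image request passes...
print(check_usage({"input_tokens": 282, "output_tokens": 16}))  # 298
# ...while a 35k-token runaway would raise TokenBudgetExceeded.
```

Again, this fires only after the tokens are billed; it just keeps one bad request from turning into many.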


The first thing I set, knowing that it would loop, was the maximum output. Useless.

This fault will incur a huge bill, unstoppable by max_output_tokens. Here I set it just above the minimum accepted value, and the call still reports the same 35k input-token usage that would have been 301 tokens for the small image otherwise:

Varying max_output_tokens doesn’t vary the bill by the same amount, nor does terminating a streaming output. I imagine that if you sent a message with many non-tiny images, the potential cost would not be limited by the model context at all, as you show. It probably also re-bills for the exempted 200+-token “deny images” system message that OpenAI attempts to keep secret.

This massively violates the API promise of max_output_tokens being a billing limit.

To stop any support time-wasting and render “please provide a request id” useless as a delay tactic:

  • Empty response and nonstop output:
    x-request-id req_f48bdc55509442b6a033864f1d7aea8b,
    req_54a906c37e774230a84105c9b04f0caf
  • Empty response and terminates on one:
    x-request-id req_e2422e2c709e496582b656864bd0ddf1

The playground seems to show only a single empty assistant output when this happens (about 50% of the time currently), even at max_output_tokens=16 and the tiniest of images at 282 tokens input:

[screenshot]


Chat Completions:

About 1 in 10 requests hits exactly the same fault as before there too, instead of going loopy:

Request ID for the Chat Completions API returning 400:
x-request-id req_eafcb2667c344ec78350e71be2405bd3

The prior thread from two weeks ago, which saw two weeks of inaction, then affecting gpt-4o, gpt-4.1 and gpt-4.1-mini fine-tunes:


@vb, any chance of this getting fixed?

This issue continues with gpt-4o fine-tuned models + image input.

Chat Completions (1/1):

Responses API (4/4):

Another thing that happened: base 4o (gpt-4o-2024-08-06) spat my JSON schema definition back at me, then continued to produce empty messages and newlines until it reached the output token limit of 5k.

Hey everyone, we believe this issue should be resolved now, as our engineering team has deployed a fix. Please let us know if you are still facing the issue. Thank you!
