We’ve encountered an unusual issue with the gpt-4o-2024-08-06 model while generating structured JSON output. In some cases, the model starts repeating the last tokens indefinitely, causing the response to break. Here’s an example of what we’re seeing:
{
"topic_1": 8.0,
"topic_2": "The text starts correctly and then \u007f\u007f\u007f\u007f\u007f\u007f\u007f\u007f\u007f\u007f\u007f..."
}
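As a side note, one pragmatic guard is to scan the raw reply for runs of the U+007F control character before parsing, and retry instead of passing the broken string downstream. A minimal sketch (the `broken` string below is made up to mimic the failure, not a real API response):

```python
import re

def has_runaway_control_chars(raw: str, threshold: int = 5) -> bool:
    """Return True if the raw model output contains a run of at least
    `threshold` consecutive U+007F (DELETE) characters."""
    return re.search("\u007f{%d,}" % threshold, raw) is not None

# Hypothetical broken reply, mimicking the failure shown above
broken = ('{"topic_1": 8.0, "topic_2": "The text starts correctly and then '
          + "\u007f" * 20 + '"}')
print(has_runaway_control_chars(broken))  # True -> retry rather than parse
```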
To temporarily mitigate the issue, we switched to o3-mini, which does not exhibit this behavior. However, since this problem is on OpenAI’s side and not due to any mistake on our part, we would like to request a refund for the affected usage.
Where can we open a support ticket to speak with a real person about processing a refund for these errors? Any guidance on this would be appreciated.
Has anyone else experienced this issue? Any insights or potential fixes would also be helpful.
This is a typical issue, often associated with a bad prompting strategy.
I would say it manifests when the model is trapped in a specific state (in terms of the JSON finite-state machine), really wants to do something else, but isn’t allowed to escape.
Here are some things you can check:
Is your schema obvious and retrievable? If your schema is not obvious, or you have multiple or conflicting schemas, the models might struggle.
Is your schema straightforward? The models can handle some complexity, but a simple, flat schema is typically your best bet.
Does your schema actually reflect the workflow? If your workflow asks the model to do something one way, but the schema forces it to approach the problem in another way, you might run into issues.
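To make the “flat vs. nested” point concrete, here are two hypothetical schemas for the same task (both invented for illustration). The flat one maps each field directly to one step of the workflow; the nested one forces the model to commit to structure before it produces any content:

```python
# Flat and obvious -- each field corresponds to one output of the workflow:
flat_schema = {
    "type": "object",
    "properties": {
        "topic_1": {"type": "number"},
        "topic_2": {"type": "string"},
    },
    "required": ["topic_1", "topic_2"],
    "additionalProperties": False,
}

# Deeply nested -- same information, but the model must navigate several
# structural layers before writing a single value:
nested_schema = {
    "type": "object",
    "properties": {
        "analysis": {
            "type": "object",
            "properties": {
                "topics": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "score": {"type": "number"},
                        },
                    },
                },
            },
        },
    },
}
```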
Here’s what I would do:
Disable JSON mode/structured outputs
Work on your prompt until the model reliably returns the schema you expect
Optionally add logit biases: negative for ["```"], positive for ["{","["], depending on your model (note that the API’s logit_bias parameter takes token IDs, not strings)
Consider leaving JSON mode/structured output off: if schema validation fails, that is actually desirable, because it implies the model failed to understand or follow the schema
If you can achieve stability with this approach, you shouldn’t run into issues when you turn JSON mode/structured outputs back on.
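The “validate yourself instead of constraining the decoder” step can be sketched like this. The expected keys and the sample replies are assumptions for illustration only, no API call is made:

```python
import json

EXPECTED_KEYS = {"topic_1", "topic_2"}  # hypothetical flat schema

def validate_reply(raw: str):
    """Parse the model's raw reply and check it against the expected keys.

    Returns the parsed dict, or None on failure. A failure here is a
    useful signal that the prompt (not the decoder) needs more work.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != EXPECTED_KEYS:
        return None
    return data

# Simulated replies standing in for model output:
print(validate_reply('{"topic_1": 8.0, "topic_2": "ok"}'))  # parsed dict
print(validate_reply('```json\n{"topic_1": 8.0}\n```'))     # None: fenced, key missing
```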
TIP:
Consider adding something like “Your reply must be valid JSON, otherwise the system will break. Begin your response with {” to the very end of your prompt, the last thing the model will see before beginning its reply.
Hope this helps, good luck!
FUN FACT: \u007f is the Unicode control character DELETE (Unicode Character 'DELETE' (U+007F)). It looks like you managed to get the model into a state where it tries to backtrack on its progress. Very cool emergent behavior!
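You can confirm this from the standard library (control characters have no formal name in `unicodedata`, but their category and codepoint identify them):

```python
import unicodedata

ch = "\u007f"
print(ord(ch))                   # 127 -- the ASCII DELETE codepoint
print(unicodedata.category(ch))  # 'Cc' -- a control character
print(ch.isprintable())          # False
```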