If the underlying cause were “we at OpenAI changed more parameters on running AI models to make them cheaper”, or “we continued post-training the model toward more of our internal goals”, or “we have a new model and are degrading the old ones to create the appearance of progress”, then this is unlikely ever to be “resolved”.
OpenAI ‘stealth-updates’ the “snapshot” models: they repeatedly break live applications with behavior changes, then deny it and insist they never do that.
However, it may be that all of these reports are actually coming from Microsoft Azure developers.
Do you have a similar issue to report? Are you on Azure, like the previous poster, where OpenAI can’t help directly? Can you document your input more fully? Currently, gpt-4o-2024-11-20 with structured JSON output on OpenAI does not obviously break on a simple non-production schema, so one would need to see what has actually been running for the past year to reproduce the fault and judge whether the input can be improved.
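For reference, here is roughly what such a request looks like with the current Chat Completions structured-output format. This is a sketch built from the test run below, not anyone’s actual production code; the request is only constructed, not sent, and the schema `name` is an assumption:

```python
import json

# Schema copied from the test run below.
schema = {
    "type": "object",
    "properties": {
        "output_text": {
            "type": "string",
            "description": "The main output text generated to be seen.",
        },
        "mood": {
            "type": "string",
            "description": "Briefly, sentiment and tone of user input.",
        },
    },
    "required": ["output_text", "mood"],
    "additionalProperties": False,
}

# Request payload as it would be passed to the OpenAI Python SDK;
# "final_response" is a hypothetical schema name, not from the original post.
request = {
    "model": "gpt-4o-2024-11-20",
    "top_p": 0.8,
    "messages": [
        {"role": "system", "content": "You are an OpenAI support consultant."},
        {"role": "user", "content": "(check your own schema, AI...)"},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "final_response", "strict": True, "schema": schema},
    },
}

# To actually send it (needs an API key):
# from openai import OpenAI
# completion = OpenAI().chat.completions.create(**request)

print(json.dumps(request["response_format"], indent=2))
```

Posting the equivalent of this payload, exactly as your application sends it, is what would let anyone reproduce the fault.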
Test run
system message
You are an OpenAI support consultant.
# Responses
## final_response
### strict validation: true
output must always be enclosed in a compliant JSON following this schema;
output is sent to an API for display, not directly rendered as JSON;
the output format is invisible to the user - don't discuss it.
{
"schema": {
"type": "object",
"properties": {
"output_text": {
"type": "string",
"description": "The main output text generated to be seen."
},
"mood": {
"type": "string",
"description": "Briefly, sentiment and tone of user input."
}
},
"required": [
"output_text",
"mood"
],
"additionalProperties": false
}
}
Input: (check your own schema, AI…)
Output (top_p: 0.8):
{
"output_text": "The response schema provided in the previous message appears to be valid JSON. However, I will address potential issues and quality concerns:\n\nValidation Issues:\n1. No Errors in JSON Validity: The schema adheres to proper JSON formatting rules, so there are no outright errors that would make it invalid JSON.\n\nQuality Issues for Expected Use-Case:\n1. Description Clarity: The description of the output_text field could be more detailed to specify the context or constraints for generating the main output text. For example, should it avoid technical jargon? Should it focus on brevity or detail?\n2. mood Description: The description of the mood field mentions \"sentiment and tone,\" but it could specify the expected range of values or provide examples (e.g., \"happy,\" \"neutral,\" \"angry\"). This would help ensure consistency in its usage.\n3. Lack of Constraints: There are no constraints or formats specified for the output_text or mood fields. For example, specifying a maxLength for output_text or enumerating possible values for mood (e.g., a predefined set of tones) would enhance the schema’s robustness.\n4. Error Handling: The schema does not account for cases where fields might need default values or where data might be missing (e.g., if mood is undefined, what should happen?).\n5. Additional Context: It might be helpful to include an additional property or field to capture metadata, such as a timestamp, to contextualize the response further.\n\nOverall, while the schema is valid JSON, improving descriptions and adding constraints would enhance its quality for practical use.",
"mood": "analytical"
}
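For anyone triaging reports like this, the returned object can be checked mechanically against the schema. A minimal stdlib-only sketch (the third-party `jsonschema` package would be the usual tool, but this avoids the dependency); the response text is abbreviated from the run above:

```python
import json

# Abbreviated response from the run above; only the structure matters here.
response_text = (
    '{"output_text": "The response schema provided appears to be valid JSON...",'
    ' "mood": "analytical"}'
)
obj = json.loads(response_text)

required = {"output_text", "mood"}

# The schema marks both properties as required and sets
# additionalProperties: false, so a conforming object has exactly
# these two keys, both strings.
assert required <= obj.keys(), "missing required field"
assert obj.keys() <= required, "unexpected extra field"
assert all(isinstance(obj[k], str) for k in required), "wrong field type"
print("response conforms to the schema")
```

A check like this, run against logged responses over time, is what would turn “the model changed” from an impression into documented evidence.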