Strange response content using Chat Completions API

We are using gpt-4o in a conversational agent that follows a step-by-step guide to create structured outputs. In the last few days, sometimes in the middle of the guide questions, the model has been generating strange responses with no content, just tab and newline escape sequences. I can’t even describe or search for this kind of issue on the internet, and I was unable to find a similar topic to follow the discussion.

Is anyone else having the same issue?


I am facing the same issue here in a similar context ☹️

It looks like you are not using json_schema as the response format, or are not using a schema with “strict”: true. That lets the AI write whatever it wants, and unless you have spelled out “Respond only in the response format JSON schema given below” in the system instructions (which you should do regardless, since the schema is not placed alongside the instructions), you get this symptom: the JSON format was trained badly, or the JSON-mode logit enforcement was coded badly, into dumping out newlines and tabs at your expense.

So it’s even weirder than we thought. We are using json_schema as the response format with strict: true, and this behavior is still happening sometimes in the middle of chat interactions.

Are you sending it with json_object or json_schema?

response_format={
  "type": "json_schema",
  "json_schema": {
    "name": "response_format_schema",
    "strict": True,
    "schema": {
      "type": "object",
      "properties": {
        # ... your properties here ...
      },
      # strict mode also requires listing every property as required
      # and setting additionalProperties to false
      "required": [],
      "additionalProperties": False,
    },
  },
}
This is what turns on enforced structured output. With it, it should be impossible for the AI to start with anything other than the opening curly bracket of JSON.

Still, I would add more to the system message, or to an assistant’s instructions, about how to use the mandated output.
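
For reference, here is a minimal sketch of a complete request built around that structure, using the openai Python SDK. The model, the system text, the user message, and the single text_to_user field are illustrative assumptions, not the poster’s actual setup:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            # restating the mandate in instructions, as suggested above
            "content": "Respond only in the response format JSON schema given below.",
        },
        {"role": "user", "content": "The product arrived broken. What now?"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "response_format_schema",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "text_to_user": {
                        "type": "string",
                        "description": "The response that the user will see.",
                    },
                },
                # strict mode: every property required, no extras allowed
                "required": ["text_to_user"],
                "additionalProperties": False,
            },
        },
    },
)

print(response.choices[0].message.content)  # a JSON string matching the schema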

Here, for example, is exactly what is placed after your instructions for a schema when using gpt-4o in Assistants:

Image input capabilities: Enabled

# Response Formats

## response_format_schema

{"type":"object","properties":{"text_to_user":{"type":"string","description":"The response that the user will see."},"disposition_of_user":{"type":"string","description":"The mood expressed by the user.","enum":["neutral","positive","negative"]},"ai_was_helpful":{"type":"boolean","description":"Indicates if the AI provided a perfect solution to a problem."}}}

You are trained on data up to October 2023.

The response_format_schema seen in this placement is the top-level name provided within the json_schema object, and it can carry guidance instead of a bare name that degrades understanding. You can see the schema is just kind of dumped there without clarity.


Here’s a better use of the 64 characters of the name field for a schema…


Image input capabilities: Enabled

# Response Formats

## MANDATORY-_Your_output_is_sent_to_an_API_-_Follow_this_schema

{"type":"object","properties":

or: "name": "MANDATORY_-_Your_JSON_output_is_sent_to_API_-_Follow_the_schema"

(These are from an assistant reproducing the actual injected text into its own schema…)


I discovered that the AI also has to produce this response-name text, which you pay for as both input and output tokens, as if it were sending to a chosen function, despite there being no other output it can produce (and even the worst repetitive 64-token name is followed faithfully). The fact that “Response Formats” is plural suggests OpenAI may have left open a future option for outputting multiple response types.

So if you want to make it more compatible with non-strict mode, a name the AI might be inclined to repeat could be more like "AI_JSON_response_to_API_parser".
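
As a sketch, that guidance-style name just replaces the name field in the same response_format structure shown above; my understanding is that the name is limited to 64 characters of letters, digits, underscores, and dashes, hence the underscores standing in for spaces:

response_format={
  "type": "json_schema",
  "json_schema": {
    # the name doubles as instruction text the model sees and reproduces
    "name": "AI_JSON_response_to_API_parser",
    "strict": True,
    "schema": {
      # ... your object schema, as before ...
    },
  },
}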


Got it. I’ll try the name suggestions you gave me and will share the observed results here next week. Thank you.

Hi @leonardoavelino and welcome to the community.

Looks similar to Structured Outputs Infinite \n Newline Characters


Good news: the changes proposed by @_j and @platypus fixed the issue. Thank you so much!
