JSON format causes infinite "\n \n \n \n" in response

I am trying to use JSON format to get a JSON response. It worked well when I gave it a short example, but when I tested it with production data (prompt_tokens = 2966), it started responding with nothing but "\n \n \n \n" until it hit the max token limit. When I send the same prompt without response_format, it works well, though the output isn't a JSON object.

(A test with another, shorter user message still failed; do the "#" heading markers matter?)

model: gpt-1106
usingAzure: True

To Reproduce

tsg = """
# TSG for debugging openai
## 1. install openai
### 1.1. install openai with pip
### 1.2. install openai with conda
## 2. import openai
### 2.1. import openai in python
### 2.2. import openai in jupyter notebook
## 3. use openai
### 3.1. use openai in python
### 3.2. use openai in jupyter notebook
messages = [
                    "role": "system",
                        You are an expert of reading troubeshooting guidance.
                        The users will provide you a troubleshooting guide in markdown format, which consists of several steps.
                        You need to break down the document into steps based on the text semantics and markdown structure and return JSON format
                        Note that: (1) Ignore the document title if it does not indicate a step. (2) Only do text slicing from front to back, can't lose any content of the step. (3) Maintain the original text in task_description without any summarization or abbreviation. (4) Don't lose the prefix and serial number of the title displayed in the document. (5) If the step itself has a title in document, the task_title should use the original content.
                        You will respond with the list of steps as a JSON object. Here's an example of your output format: 
                            "task_title": "",
                            "task_description": "",
                            "task_title": "",
                            "task_description": "",
                        Here is an example of the input markdown document:
                            # Troubleshooting guide for buying a puppy
                            ## 1. know what puppy you want
                            ### 1.1. you could surf the internet to find the puppy you want
                            ### 1.2. visit friends who have puppies to see if you like them
                            ## 2. buy healthy puppies
                            ### 2.1. you could go to puppy selling websites to find healthy puppies, if you prefer buying puppies online, please go to step 3 for more information
                            ### 2.2. you could go to pet stores to find healthy puppies
                            ## 3. buy puppies online
                            here is a list of puppy selling websites: www.happydog.com, www.puppy.com, www.puppylove.com
                        Here is an example of the output json object:
                            "task_title": "1. know what puppy you want",
                            "task_description": "### 1.1. you could surf the internet to find the puppy you want\n### 1.2. visit friends who have puppies to see if you like them"
                            "task_title": "2. buy healthy puppies",
                            "task_description": "### 2.1. you could go to puppy selling websites to find healthy puppies, if you prefer buying puppies online, please go to step 3 for more information\n### 2.2. you could go to pet stores to find healthy puppies"
                            "task_title": "3. buy puppies online",
                            "task_description": "here is a list of puppy selling websites: www.happydog.com, www.puppy.com, www.puppylove.com"
                    "role": "user", 
                    "content": tsg

response = llm.client.chat.completions.create(
            model = llm.engine,
            messages = messages,
            response_format={"type": "json_object"},
            temperature = 0,

When I shorten the system message, it works. However, when the user message is longer, it generates "xxxxx \n \n \n …" until it reaches the 4096-token limit. Is that because the generation is too long? But then why is it fine with response_format disabled?

Does JSON format take more tokens because of the whitespace used for formatting?

```
\n        {\n            "steps": [\n                {\n                    "task_title": "1. Find out which requests timeout",\n                    "task_description": "Generate and Run kusto query:```
```
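Whitespace does cost tokens. A quick check with tiktoken (assuming cl100k_base, the encoding the 1106 models use), comparing a pretty-printed and a compact rendering of the same object:

```python
import json
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by the 1106 models

obj = {
    "steps": [
        {
            "task_title": "1. know what puppy you want",
            "task_description": "### 1.1. you could surf the internet to find the puppy you want",
        }
    ]
}

pretty = json.dumps(obj, indent=4)                # indented, multi-line JSON
compact = json.dumps(obj, separators=(",", ":"))  # no cosmetic whitespace

print(len(enc.encode(pretty)), "tokens pretty-printed")
print(len(enc.encode(compact)), "tokens compact")
```

The indented form is noticeably more expensive, but indentation alone doesn't explain a run all the way to the 4096-token cap.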

I'm facing the same issue with json_mode and my fine-tuned gpt-3.5-turbo-1106 model. I always get tons of \n and \t when calling via my API, even though the playground responds just fine.

It seems like you guys face this:

  • When using JSON mode, always instruct the model to produce JSON via some message in the conversation, for example via your system message. If you don’t include an explicit instruction to generate JSON, the model may generate an unending stream of whitespace and the request may run continually until it reaches the token limit. To help ensure you don’t forget, the API will throw an error if the string "JSON" does not appear somewhere in the context.

Which is weird, considering you do have JSON mentioned.
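The documented guard is just a substring test, so it's easy to mirror client-side before enabling JSON mode. A minimal sketch (hypothetical helper, not part of the SDK; the exact case-sensitivity of the server check isn't documented):

```python
def assert_json_mentioned(messages: list[dict]) -> None:
    """Mirror the documented server-side guard: json_object requests are
    rejected unless "JSON" appears somewhere in the context."""
    context = " ".join(str(m.get("content", "")) for m in messages)
    if "json" not in context.lower():
        raise ValueError(
            'Include an explicit "JSON" instruction in your messages '
            'before setting response_format={"type": "json_object"}.'
        )

assert_json_mentioned(messages)  # passes for the repro above: the system prompt says "JSON object"
```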

Does it exceed your token count at a certain point?

This is a bug we see with the Assistants API on the same model when we try to generate JSON.

The output tokens exceeded the limit, though I have no idea why it generates so many '\n', wasting my tokens. The input is nearly 3k tokens, which I think is really short.

Did you find a fix for this? I'm getting a similar problem even though I'm very clear in specifying JSON.

and yet this says JSON is ensured…

It also supports our new JSON mode, which ensures the model will respond with valid JSON

When JSON mode is enabled, the model is constrained to only generate strings that parse into a valid JSON object.

You have to read through the claims to imagine what the technology actually does.

But the fault seems to be in mistraining, despite what are probably some logit_bias games behind the scenes.

  • The AI should not be outputting nulls or carriage returns as its first tokens
  • Tabs produce poorer-quality joining with other tokens
  • The AI is even more prone to repetition, which is what you see here

Let's crank the frequency penalty to the max, so each token the AI has already emitted is heavily penalized on reuse, and ask about lobsters (individual tokens shown in single quotes):

```
'' ' \n' ' ' '\t' '\t' '{"' 'Success' '":' ' true' ',' ' "' 'Data' '":' ' {"' 'Response' '":' ' "' 'The' ' Maine' ' lobster' ' is' ' not' ' a' ' different' ' breed' '.' ' It' ' is' ' the' ' same' ' species' ' as' ' other' ' North' ' American' ' lob' 'sters' ' (' 'Hom' 'arus' ' american' 'us' '),' ' but' ' it' ' has' ' been' ' suggested' ' that' ' the' ' cold' ' water' ' of' ' the' ' North' ' Atlantic' ' off' ' the' ' coast' ' of' ' Maine' ' may' ' contribute' ' to' ' its' ' superior' ' taste' ' and' ' quality' '."' '}}\n' None

'' '\n' ' ' '\t' '\t' '{"' 'text' '":' ' "' 'Yes' ',' ' the' ' Maine' ' lobster' ',' ' also' ' known' ' as' ' the' ' American' ' lobster' ',' ' is' ' a' ' distinct' ' species' ' from' ' other' ' types' ' of' ' lob' 'sters' ' found' ' in' ' different' ' regions' '.' ' It' ' is' ' known' ' for' ' its' ' delicious' ' taste' ' and' ' is' ' highly' ' prized' ' in' ' culinary' ' dishes' '."' '}\n' None
```


So without JSON being specified, and without a schema and key specification, there are several easily repeated tokens to exhaust before it gets to the JSON: tabs, two-space indents, a linefeed for no reason. The AI is also happy to repeat back my input under different system messages, because it is eager to loop.
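If you want to reproduce the dumps above, here is a sketch assuming the v1 Python SDK and an API key in the environment; the model and prompt are just what I used, swap in your own:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

resp = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[
        {"role": "system", "content": "You respond with a JSON object."},
        {"role": "user", "content": "Is the Maine lobster a different breed?"},
    ],
    response_format={"type": "json_object"},
    frequency_penalty=2.0,  # maximum penalty: tokens already emitted are strongly discouraged
    logprobs=True,          # per-token output, so the leading whitespace is visible
)

# Print each generated token individually, quoted like the dumps above.
print(" ".join(repr(t.token) for t in resp.choices[0].logprobs.content))
```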

A logit_bias dictionary could remove these, and also the waste of indented multi-line JSON; logprobs can now reveal more. A sketch follows.
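For example, here is a sketch that gathers whitespace-only token IDs with tiktoken (assuming cl100k_base) and bans them. Note this also forbids legitimate pretty-printing, which here is the point:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Whitespace-only strings seen before the JSON payload in the dumps above.
leading_junk = ["\n", " \n", "\t", "\t\t", "  ", "\n\n"]

logit_bias = {}
for s in leading_junk:
    for token_id in enc.encode(s):
        logit_bias[token_id] = -100  # -100 effectively bans a token

print(logit_bias)  # pass as logit_bias=... in client.chat.completions.create()
```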

Guys, I think you're encountering a very specific situation where the model trips up. I use JSON mode regularly in several use cases and don't encounter this problem (I only did initially, when I hadn't set up the prompts and parameters properly).

Having the same issue on all models; it never happened before, and it doesn't happen with the same prompt executed on other data… I have no clue how to fix this.

We had some limited success here by introducing a logit bias for \n:

"logit_bias": {1734:-100}

But it’s certainly not an optimal solution. :confused:
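For completeness, this is how we apply it, with a max_tokens cap to bound the cost if the loop still happens (1734 is the ID we observed; verify it against your model's tokenizer with tiktoken):

```python
response = llm.client.chat.completions.create(
    model=llm.engine,
    messages=messages,
    response_format={"type": "json_object"},
    temperature=0,
    logit_bias={1734: -100},  # suppress the runaway newline token (ID as reported above)
    max_tokens=2048,          # bound the damage if the whitespace loop still occurs
)
```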

Noticed the same behavior. This makes JSON mode completely unusable at the moment.