gpt-4-0125-preview seems to have a 4k total token limit?

I am using JSON mode in gpt-4-0125-preview. I provide a system message (as the very first message in a series) which instructs the AI to generate JSON.

I get proper JSON back until I pass the 4k total token mark. When total_tokens goes over 4k, I get an endless whitespace response. I presume this is because the very first system message instructing the model to use JSON has rolled off of the context window.

When using JSON mode, always instruct the model to produce JSON via some message in the conversation, for example via your system message. If you don’t include an explicit instruction to generate JSON, the model may generate an unending stream of whitespace and the request may run continually until it reaches the token limit.

Here is an example of the completion object and then the string it returned (which I was trying to convert to JSON). It’s just endless whitespace. All previous calls, where total_tokens was under 4k, returned proper JSON.

I thought gpt-4-0125-preview supported a much larger total_tokens amount. Is it really 4k?

{   "id": "chatcmpl-8wFPcWl8oZDaSMmBIjKkXIJzuAbeM",   "choices": [     {       "finish_reason": "stop",       "index": 0,       "logprobs": null,       "message": {         "content": "REDACTED",         "role": "assistant",
"function_call": null,         "tool_calls": null       }     }   ],   "created": 1708892960,   "model": "gpt-4-0125-preview",   "object": "chat.completion",   "system_fingerprint": "fp_89b1a570e1",   "usage": {     "completion_tokens": 600,
"prompt_tokens": 3486,     "total_tokens": 4086   } }
Error loading JSON from string: 










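One way to fail fast on this failure mode, rather than letting json.loads choke on a wall of newlines, is to check for an all-whitespace body before parsing. This is a sketch, not part of the original script; the helper name is just an illustration:

```python
import json


def parse_json_response(content: str) -> dict:
    # Guard against the all-whitespace failure mode before handing the
    # string to json.loads, which would otherwise raise a JSONDecodeError.
    if content is None or not content.strip():
        raise ValueError("model returned only whitespace; retry or trim context")
    return json.loads(content)
```

The caller can catch the ValueError and retry with a trimmed context instead of crashing.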
That’s an interesting result - we’ve seen similar stuff with whitespaces/newlines before, but typically with Chinese input.

IMO json mode (and tool/function calling) is a gimmick of a band-aid of a solution - properly prompting the model will usually get you the same results without these weird bugs.

It does indeed - however, it’s possible that your schema instruction gets lost in the context. If you’d actually exhausted the context length, you’d typically get an error message rather than whitespace.

I don’t know if this will solve your particular issue, but have you considered attaching the schema to the very end of your context?
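That suggestion could look something like this - a minimal sketch where the helper name and reminder text are just illustrations, reusing the schema line from your prompt:

```python
# Hypothetical helper: re-append the schema instruction as the final
# message on every turn so it can't drift out of the model's attention.
SCHEMA_REMINDER = {
    "role": "system",
    "content": "Reply in JSON format using this schema {'response': your response}",
}


def with_schema_last(history: list) -> list:
    # Return a copy of the conversation with the reminder appended last.
    return history + [SCHEMA_REMINDER]
```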

It’s sometimes also a good idea to add a json field where the model can do some CoT or reasoning/disclaiming/whining stuff before giving you the actual answer.
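For example, the schema line in the prompt could be extended like this (the "reasoning" field name is just an illustration):

```python
# Hypothetical schema instruction with a scratch field the model fills in
# before the answer; downstream code only reads the "response" key.
schema_instruction = (
    "Reply in JSON using this schema: "
    '{"reasoning": "your chain-of-thought, caveats, or disclaimers", '
    '"response": "your actual answer"}'
)
```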


I’m just spitballing here but it’s possible that you’re seeing the equivalent of the typical 0125 task refusal, just in json mode :laughing:

We can take a closer look at your prompt if you’re willing to post it.

Here is a script that will reproduce the issue. Once total_tokens gets to around 3k, you should soon get back a response that is all newlines. This assumes your API key is available as an environment variable.

from openai import OpenAI
import json

client = OpenAI()
model = "gpt-4-turbo-preview"

system_message_text = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum. Lorem ipsum dolor sit
amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum. Lorem ipsum dolor sit amet,
consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco
laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum. Lorem ipsum dolor sit amet, consectetur
adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim
id est laborum. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Lorem
ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum. Lorem ipsum dolor sit
amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum. Lorem ipsum dolor sit amet,
consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et
dolore.
Reply in JSON format using this schema {'response' : your response}
"""


system_message = {"role": "system", "content": system_message_text}
user_message = {"role": "user", "content": "Give me a random fact."}
msg_array = [system_message, user_message]

while True:
    completion = client.chat.completions.create(
        model=model,
        response_format={"type": "json_object"},
        messages=msg_array,
    )
    print(completion)
    # This raises json.JSONDecodeError once the model starts returning
    # only whitespace instead of JSON.
    ai_response = json.loads(completion.choices[0].message.content)["response"]
    msg_array.append({"role": "assistant", "content": ai_response})
    msg_array.append(user_message)

:|

What behavior would we expect here?

Well, up until around the 3k total_tokens mark, the AI responds with properly formatted JSON. Sometime after that, it responds with infinite newlines.

So the behavior I would expect is that it continues to respond with properly formatted JSON and not infinite newlines.
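If dropping old turns is acceptable for your use case, one mitigation (a sketch under that assumption - the function and parameter names are illustrations) is to bound the history and re-send the JSON instruction on every call, instead of relying on the original system message surviving the whole conversation:

```python
# Hypothetical variant of the repro loop's message assembly: bound the
# history and always finish with a fresh JSON-schema reminder.
def next_messages(history: list, user_message: dict, schema_text: str,
                  max_history: int = 20) -> list:
    trimmed = history[-max_history:]  # keep only the most recent turns
    reminder = {"role": "system", "content": schema_text}
    return trimmed + [user_message, reminder]
```

The repro loop would then call `client.chat.completions.create(..., messages=next_messages(msg_array, user_message, schema_text))` each iteration.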