{ "type": "json_object" } not always working

I’m running a chat completion using the option { "type": "json_object" } against "model": "gpt-4o". Additionally, I have in my "role": "system" prompt a section which says "Your responses are in JSON format. Make sure that double quotes and newline characters within JSON property string values are properly escaped".
The whole prompt is about extracting a summary and contact data from documents.

However, I frequently get back responses that are not valid JSON, because e.g. double quotes or newlines inside string values are not escaped, which corrupts the JSON.
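To make the failure concrete, here is a minimal, self-contained illustration (the summary text is made up, not from an actual response) of how one unescaped inner quote breaks parsing:

```python
import json

# An unescaped inner double quote inside a string value (as in the
# responses described above) makes the whole document unparseable.
bad = '{"summary": "He said "hello" and left."}'
good = '{"summary": "He said \\"hello\\" and left."}'

try:
    json.loads(bad)
    is_valid = True
except json.JSONDecodeError:
    is_valid = False  # this branch is taken: "hello" is a stray token

print(is_valid)
print(json.loads(good)["summary"])  # He said "hello" and left.
```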

Is this a known problem with "gpt-4o"? Am I doing something wrong?

Welcome to the community!

I typically put the instruction (and the schema) at the very end of the prompt.

One thing I noticed is that not all models behave the same. “gpt-4o” is an alias that can resolve to different snapshots over time, so I’d recommend pinning a fixed, stable version by name (e.g. “gpt-4o-2024-05-13”): https://platform.openai.com/docs/models#gpt-4o

However, I would say that in most cases it probably comes down to your prompt, which might be confusing the model. If it’s not feasible to clean up your prompt (lack of time, experience, etc.), you might be best served by using structured outputs instead? (https://platform.openai.com/docs/guides/structured-outputs) Just a thought :thinking:
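For what it’s worth, a structured-outputs request uses a "json_schema" response format with "strict": true. The schema below is only my guess at a summary/contact extraction shape for your use case — the field names are illustrative, not from your post:

```python
# Hypothetical response_format for a structured-outputs request.
# Field names ("summary", "contacts", ...) are illustrative only.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "document_extraction",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "summary": {"type": "string"},
                "contacts": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "email": {"type": "string"},
                        },
                        "required": ["name", "email"],
                        "additionalProperties": False,
                    },
                },
            },
            "required": ["summary", "contacts"],
            "additionalProperties": False,
        },
    },
}
# Pass this as response_format=... in chat.completions.create(...).
```

With "strict": true the model’s output is constrained to the schema, so escaping problems of the kind described above should disappear by construction.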

There’s more stuff you can do, depending on how deep you want to get down the rabbit hole. Adjusting the prompt, adjusting the schema, tweaking the logit bias (https://platform.openai.com/docs/api-reference/chat/create#chat-create-logit_bias) etc, etc. But if you just want stuff to work fast, structured output might be the way to go for you.

1 Like

Maybe try specifying the json format explicitly? like,

Your responses are in JSON format. Please follow the below format:
{"summary": "your summarized content"}

Thanks for your feedback guys.

Every part/sentence of my (lengthy) prompt is already very specific, and the prompt itself is consistent.

So far, I have not seen any value in trying structured outputs, since the structure of the requested JSON output (JSON elements, sub-elements, arrays, etc.) is already correct. It’s just that the model at times ‘forgets’ to escape double quotes and newlines in string property values.

By simply lowering the temperature, prompting the model to generate “valid JSON”, and describing the attributes and values, I have found that these issues were resolved in my experience when generating with {"type": "json_object"}.
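As a rough sketch of those adjustments (the prompt wording and field names here are illustrative, not the exact prompt used):

```python
# Request body combining the adjustments described above: low
# temperature, an explicit "valid JSON" instruction, and a description
# of the expected attributes. Field names are illustrative.
request = {
    "model": "gpt-4o",
    "temperature": 0,
    "response_format": {"type": "json_object"},
    "messages": [
        {
            "role": "system",
            "content": (
                "Respond only with valid JSON of the form "
                '{"summary": "<string>", "contacts": [{"name": "<string>", '
                '"email": "<string>"}]}. Escape all double quotes and '
                "newlines inside string values."
            ),
        },
    ],
}
```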

3 Likes

Thanks @all. I’ll look into the suggestions you provided…