Bug with JSON file output

skyler-at-zd · January 11, 2024, 8:54pm

Using the Assistants API with Code Interpreter enabled, I have an assistant that regularly returns a properly formatted JSON file for download at the end of the thread. The entire thread prompting works well, reliably returning accurate text in the correct format.

But 1 out of every 10-20 times, the JSON returned is formatted with single-quotes, rendering it invalid.

This is clearly a bug since the structure of the output file and information contained within appears correct, but the model/Code Interpreter is not following valid JSON specifications.

elmstedt · January 12, 2024, 12:26am

It’s not a bug in the traditional sense. The models are stochastic, so they sometimes predict the wrong tokens. That’s just how they work.

Try lowering the temperature or, better yet, just perform some post-processing on your outputs.
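A minimal sketch of the post-processing suggested here, assuming the invalid files are Python-style literals (single-quoted strings, `True`/`False`/`None`). That assumption is not confirmed anywhere in the thread, and the function names are hypothetical. It tries strict JSON first and only falls back to `ast.literal_eval`, which parses literals without executing code.

```python
import ast
import json


def parse_lenient(text):
    """Parse strict JSON, falling back to Python-literal syntax."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: the bad output looks like {'a': 1, 'ok': True}.
        # literal_eval raises ValueError or SyntaxError if it is not.
        return ast.literal_eval(text)


def normalize_to_json(text):
    """Return a valid JSON string for either input style."""
    return json.dumps(parse_lenient(text))


print(normalize_to_json("{'a': 1, 'ok': True}"))
```

If the fallback also fails, the exception is left to propagate so the caller can retry the run instead of silently accepting a broken file.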

parallels · January 12, 2024, 3:45am

+1 to post-processing. i was having an issue where occasionally it would nest the json inside markdown code blocks (a fence tagged json around the payload).

i run it through a func to strip them.
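The poster's function is not shown, so the following is only a guess at what such a stripper might look like. `strip_code_fence` is a hypothetical name, and the sketch assumes the whole response is a single fenced block with an optional language tag.

```python
import json
import re

FENCE = "`" * 3  # the three-backtick markdown fence

# Opening fence with optional language tag, the body, then the closing fence.
FENCE_RE = re.compile(
    r"^\s*" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text):
    """Return the body of a fenced block, or the trimmed text if unfenced."""
    match = FENCE_RE.match(text)
    return match.group(1) if match else text.strip()


wrapped = FENCE + 'json\n{"a": 1}\n' + FENCE
print(json.loads(strip_code_fence(wrapped)))
```

Responses that mix prose with a fenced block would need a search for the first fence rather than a full-string match.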

the one i can’t figure out how to post-process is the rare response that omits a single random comma
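One possible heuristic for the missing-comma case, offered as a sketch and not a robust fix: Python's `json.JSONDecodeError` exposes `msg` and `pos`, so a comma can be inserted where the parser reports it expected one, then parsing is retried. The function name and the `max_repairs` limit are hypothetical, and the check relies on the error message text that CPython's `json` module currently uses.

```python
import json


def loads_with_comma_repair(text, max_repairs=5):
    """Parse JSON, inserting a comma wherever the parser expected one."""
    for _ in range(max_repairs + 1):
        try:
            return json.loads(text)
        except json.JSONDecodeError as err:
            # Only handle the specific "missing comma" complaint.
            if err.msg != "Expecting ',' delimiter":
                raise
            # err.pos is the index of the unexpected character.
            text = text[:err.pos] + "," + text[err.pos:]
    raise ValueError("too many missing commas to repair")


print(loads_with_comma_repair('{"a": 1 "b": 2}'))
```

The repair can produce syntactically valid but semantically wrong data if the real defect was something else, so validating the result against the expected schema afterwards is advisable.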

skyler-at-zd · January 12, 2024, 7:10pm

I disagree. The model is producing the content of the output file, and Code Interpreter explicitly supports JSON as an input and output document file format: https://platform.openai.com/docs/assistants/tools/code-interpreter

elmstedt · January 12, 2024, 7:11pm

I’m sorry, can you please explain what you disagree with?

Diet · January 12, 2024, 7:24pm

Rather than decreasing temperature, it’s probably better to decrease top_p.

The model probably “knows” that ’ isn’t the best choice, but it’s still a choice. Top_p eliminates the possibility of it getting randomly picked if that is the actual problem. At least that’s my understanding of how it works. @_j is an expert at this if you need more help.

but if it’s a direct output from the code interpreter (i.e. you’re generating and saving a file) - then you may need to tell code interpreter what tool you want used. Maybe it’s a JSON5 thing, but this is just a wild guess. Do you have records of where it went wrong?

edit: reproduced it with json5.

so yeah, top_p around 0.8 or 0.9 (or maybe a prompt adjustment (or both)) might solve it
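To illustrate the top_p reasoning above, here is a toy sketch of nucleus filtering. The probabilities are invented for illustration and do not come from any real model. The filter keeps the smallest set of highest-probability tokens whose cumulative mass reaches `top_p`, then renormalizes, so a low-probability single quote can no longer be sampled.

```python
def nucleus_filter(probs, top_p):
    """Keep the top tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution for the opening quote of a key.
next_token = {'"': 0.93, "'": 0.05, "`": 0.02}
print(nucleus_filter(next_token, top_p=0.9))
```

With these made-up numbers the double quote alone already covers 0.9 of the mass, so the single quote is dropped. Lowering temperature would only make it less likely, not impossible.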

skyler-at-zd · January 16, 2024, 6:34pm

I am using the Assistants API which does not allow for setting the temperature, but does directly invoke the Code Interpreter tool and specifies file output.

Diet · January 16, 2024, 6:55pm

Do you still have the threads where the issue occurs? Do you still have the malformed documents?

What I’m saying is that you might need/want to specify the format or tool that Code Interpreter is supposed to use to export your data.
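As a sketch of why the export method matters: one plausible cause, which is an assumption the thread does not confirm, is generated code that writes a Python dict's string form to the file instead of serializing it. The sample data and the `is_valid_json` helper are hypothetical.

```python
import json

data = {"name": "example", "ok": True, "note": None}

python_repr = str(data)        # Python literal syntax, not JSON
valid_json = json.dumps(data)  # what the instructions could ask for


def is_valid_json(text):
    """Return True if text parses as strict JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(python_repr, is_valid_json(python_repr))
print(valid_json, is_valid_json(valid_json))
```

Under that assumption, an instruction along the lines of "write the file with json.dump and reload it with json.load before returning it" might reduce the failure rate, though nobody in the thread reports testing it.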

szaslavsky · April 4, 2024, 5:11pm

@elmstedt - setting the temperature on assistants is not supported yet. It’s only supported on completions.

skyler-at-zd · April 4, 2024, 5:30pm

@Diet @elmstedt

This issue occurs with the Assistants API, which does not support what you’re describing. Code Interpreter explicitly supports output (and files) of certain file types & formats: https://platform.openai.com/docs/assistants/tools/code-interpreter

This is a bug any way you look at it.

I can, however, report that the issue became less frequent with subsequent updates of the GPT-4-turbo model.
