Using the Assistants API with Code Interpreter enabled, I have an assistant that regularly returns a properly formatted JSON file for download at the end of the thread. The entire thread prompting works well, reliably returning accurate text in the correct format.
But 1 out of every 10-20 times, the JSON returned is formatted with single-quotes, rendering it invalid.
This is clearly a bug since the structure of the output file and information contained within appears correct, but the model/Code Interpreter is not following valid JSON specifications.
It’s not a bug in the traditional sense. The models are stochastic, so they sometimes predict the wrong tokens. That’s just how they work.
Try lowering the temperature or, better yet, just perform some post-processing on your outputs.
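A rough sketch of what that post-processing could look like. It assumes the single-quoted files are really Python dict reprs (a guess, nothing in this thread confirms it). The names `load_lenient` and `to_valid_json` are made up for illustration.

```python
import ast
import json


def load_lenient(text: str):
    """Parse strict JSON first, then fall back to a Python-literal parse."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: the invalid file is a Python repr such as {'a': 1}.
        # literal_eval only evaluates literals, so it does not execute code.
        return ast.literal_eval(text)


def to_valid_json(text: str) -> str:
    """Re-serialize whatever was parsed as strict, double-quoted JSON."""
    return json.dumps(load_lenient(text))
```

If the text is neither valid JSON nor a Python literal, `ast.literal_eval` raises `ValueError` or `SyntaxError`, so the caller still finds out about files that can't be rescued.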
+1 to post processing. i was having an issue where occasionally it would nest the json into markdown code blocks.
i run it through a func to strip them.
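For anyone who wants a starting point, a stripping function along those lines might look like this. It is not the poster's actual code, just a minimal sketch; `strip_code_fence` is a hypothetical name, and it only handles a single fence wrapping the whole response.

```python
import re

# Three backticks, built this way so no literal fence appears in the example.
TICKS = "`" * 3

# Optional language tag after the opening fence, then the body, then the closing fence.
_FENCE = re.compile(
    r"^\s*" + TICKS + r"[\w-]*[ \t]*\n(.*?)\n?[ \t]*" + TICKS + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the body of a fenced block, or the trimmed text if unfenced."""
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text.strip()
```

The result can then go straight into `json.loads`.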
the one i can’t figure out how to post-process is the rare response that omits a single random comma
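One heuristic that might help with the missing comma: Python's `json` module reports where it expected a comma via `JSONDecodeError.pos`, so a comma can be inserted there before retrying. This is only a sketch under that assumption, and `loads_with_comma_repair` is a made-up name.

```python
import json


def loads_with_comma_repair(text: str, max_repairs: int = 5):
    """Parse JSON, inserting a comma where the parser says one is missing."""
    for _ in range(max_repairs):
        try:
            return json.loads(text)
        except json.JSONDecodeError as err:
            # Only handle the "missing comma" case; re-raise anything else.
            if not err.msg.startswith("Expecting ',' delimiter"):
                raise
            # err.pos points at the character found where a comma was expected.
            text = text[:err.pos] + "," + text[err.pos:]
    return json.loads(text)  # final attempt; raises if still invalid
```

It can guess wrong (for example, when the real problem is a missing quote or a truncated file), so it is worth validating the repaired data against the expected structure afterwards.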
I disagree, the model is producing the content within the outputted file. Code Interpreter explicitly supports JSON as an input and output document file format: https://platform.openai.com/docs/assistants/tools/code-interpreter
I’m sorry, can you please explain what you disagree with?
Rather than decreasing temperature, it’s probably better to decrease top_p.
The model probably “knows” that a single quote (') isn’t the best choice, but it’s still a choice. Top_p eliminates the possibility of it getting randomly picked if that is the actual problem. At least that’s my understanding of how it works. @_j is an expert at this if you need more help.
but if it’s a direct output from the code interpreter (i.e. you’re generating and saving a file) - then you may need to tell code interpreter what tool you want used. Maybe it’s a JSON5 thing, but this is just a wild guess. Do you have records of where it went wrong?
edit: reproduced it with json5.
so yeah, top_p around 0.8 or 0.9 (or maybe a prompt adjustment (or both)) might solve it
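To illustrate the top_p idea, here is a toy version of nucleus sampling's filtering step. The probabilities are invented for the example and this is not how any particular API implements it, just the general technique as I understand it.

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the smallest set of most likely tokens whose total mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution at the point where a quote is emitted.
next_token = {'"': 0.93, "'": 0.05, "`": 0.02}
filtered = top_p_filter(next_token, top_p=0.9)
print(filtered)
```

With these made-up numbers only the double quote survives the cut, so the single quote can no longer be sampled. With `top_p=1.0` all three stay in play.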
I am using the Assistants API which does not allow for setting the temperature, but does directly invoke the Code Interpreter tool and specifies file output.
Do you still have the threads where the issue occurs? Do you still have the malformed documents?
What I’m saying is that you might need/want to specify the format or tool Code Interpreter is supposed to use to export your data.
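As a concrete example of why the export method matters: writing `str(data)` to a file gives a Python repr with single quotes, while `json.dumps` gives valid JSON. Whether this is what actually happens inside Code Interpreter on the failing runs is a guess, and the `INSTRUCTIONS` wording below is only a hypothetical example of what could be added to the assistant's instructions.

```python
import json

data = {"name": "example", "ok": True}

as_repr = str(data)         # Python repr: single quotes, True
as_json = json.dumps(data)  # valid JSON: double quotes, true

print(as_repr)
print(as_json)

# Hypothetical wording to add to the assistant's instructions.
INSTRUCTIONS = (
    "When saving the result, write it with json.dump(data, f) from Python's "
    "standard json module. Never write str(data) or repr(data) to the file."
)
```

The first form fails `json.loads`, which matches the symptom described at the top of the thread.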