One way to make it more robust is to end your instruction with something like something like
Begin your response with {
and remove this:
NEVER add a JSON code block syntax to your response
it’s probably just confusing the model if it does anything.
You can also fortify the message like this:
You are part of a bigger system, and you must always respond with a pure JSON response. The system will break if you don’t.
I can’t test this right now, but there have been issues with logit_bias with vision in the past. You can try to add a -100 bias for the triple tick, (74694), you might get an error.
Finally, with the same caveat, you could try json mode.
all that said and done, sometimes the model just screws up. if you have a 20% error rate that manifests like this, and you mitigate it, you might have a 2% error that somehow manifests as something else. mitigate that, and you might get a 0.2% error. Sometimes you just need to catch and retry. Even json modes isn’t immune to screwups.