The first thing I’d do: look at the output token count reported for the call, and compare it to the token count of the text you actually received, run through the o200k-harmony tokenizer (likely the same as o200k_base).
Is the reported count much larger than what the visible output accounts for?
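A rough sketch of that comparison, in case it helps. The function names here are my own, and it assumes the third-party tiktoken package is installed and provides the "o200k_base" encoding; swap in whichever encoding matches your model.

```python
from typing import Callable, Sequence


def hidden_token_gap(reported_output_tokens: int,
                     visible_text: str,
                     encode: Callable[[str], Sequence[int]]) -> int:
    """Return how many reported output tokens the visible text does not account for."""
    visible_tokens = len(encode(visible_text))
    return reported_output_tokens - visible_tokens


def o200k_encoder() -> Callable[[str], Sequence[int]]:
    """Build an encoder from tiktoken (assumption: tiktoken is installed)."""
    import tiktoken  # third-party; imported lazily so the helper above has no dependency

    return tiktoken.get_encoding("o200k_base").encode
```

Usage would be something like `hidden_token_gap(count_from_usage, received_text, o200k_encoder())`. A small gap is normal (special tokens and formatting overhead), and on reasoning models the reasoning tokens are counted as output too, so it is a large unexplained gap that is worth chasing.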
Then log the raw HTTP exchange outside of the SDK you are calling through, and look at the raw content being returned from the same call.
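One way to do that without any SDK in the path is a plain standard-library request. This is only a sketch: the endpoint URL is my assumption (Chat Completions), so substitute whichever endpoint and payload your original call uses.

```python
import json
import urllib.request

# Assumption: the Chat Completions endpoint; replace with the endpoint you actually call.
API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(api_key: str, payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_raw(request: urllib.request.Request) -> bytes:
    """Send the request and return the response body exactly as received."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Printing `fetch_raw(build_request(key, payload)).decode("utf-8")` shows the body before any SDK parsing touches it, so you can see whether the string is already cut short on the wire.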
You show an ellipsis: are there more structured-output JSON keys continuing after that point? And are any of those keys ones that should be written before the message output, where they would enhance planning?
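To answer that from the raw string, a small check like the following lists the keys in the order the model wrote them, or reports where parsing fails if the JSON was cut off. The key names in the usage note are made up for illustration.

```python
import json


def inspect_structured_output(raw: str):
    """Return (keys_in_written_order, error_message_or_None) for a raw JSON string."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [], f"invalid or truncated JSON at character {exc.pos}"
    if not isinstance(parsed, dict):
        return [], "top-level value is not a JSON object"
    # Python dicts preserve insertion order, so this is the order the keys were written in.
    return list(parsed.keys()), None
```

For example, `inspect_structured_output('{"plan": "x", "message": "y"}')` gives the keys with `plan` ahead of `message`, which is the order you would want if the plan is supposed to inform the message.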
If it is a single anomaly, where the AI model itself has a tendency to close the string once it has written a colon, you could state in the output schema that producing a colon (":") is forbidden in that JSON field.
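As a sketch of what that could look like: the instruction goes in the field's description, and a client-side check catches any reply that ignores it. The key name "message" and the wording of the description are hypothetical, not taken from your schema.

```python
# Hypothetical schema: the key name "message" is for illustration only.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "message": {
            "type": "string",
            # The model reads this description as part of the schema it is given.
            "description": "Reply text. Must not contain a colon character (':').",
        },
    },
    "required": ["message"],
    "additionalProperties": False,
}


def violates_colon_rule(parsed: dict, field: str = "message") -> bool:
    """Return True if the named field of a parsed reply contains a colon."""
    return ":" in parsed.get(field, "")
```

The description is only a request to the model, not an enforced constraint, so the check is there to tell you whether the workaround is actually holding.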