I take the threadId, and run it with a different assistant. This second assistant still uses the response format defined on the first assistant, even though I defined a specific response format on the second assistant.
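To make the scenario concrete, here's a minimal sketch of what I understand you're doing (assuming the Python `openai` SDK and the beta Assistants endpoints; the model name, instructions, and prompts are my own placeholders):

```python
from openai import OpenAI

client = OpenAI()

# First assistant: enforced JSON output
assistant_json = client.beta.assistants.create(
    model="gpt-4o-mini",  # placeholder model
    instructions="Extract facts from the user's message and answer only in JSON.",
    response_format={"type": "json_object"},
)

# Second assistant: its own (plain-text) response format
assistant_text = client.beta.assistants.create(
    model="gpt-4o-mini",
    instructions="Answer conversationally in plain prose.",
    response_format={"type": "text"},
)

# One thread, reused across both assistants
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Tell me about Mars."
)
client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant_json.id
)

# Same thread, second assistant: the JSON replies already sitting in the
# thread keep steering the model toward JSON despite the new settings
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Now tell me about Venus."
)
client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant_text.id
)

for m in client.beta.threads.messages.list(thread_id=thread.id):
    print(m.role, ":", m.content[0].text.value)
```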
AI models learn quickly from the context window, taking cues from far more than just the text labeled as instructions. How strongly this in-context learning takes hold depends on the quality of the underlying model, and on how much it has been overtrained on “OpenAI chat” as its primary behavior. A base completion engine will produce JSON matching your earlier, uninstructed pattern simply by being shown a few prior responses to similar inputs (multi-shot).
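You can see this with plain Chat Completions: nothing below asks for JSON, yet the prior assistant turns alone will usually pull the next reply into the same format (a sketch; the model name is just an example):

```python
from openai import OpenAI

client = OpenAI()

# No system message and no instruction asking for JSON anywhere:
# the prior assistant turns alone establish the pattern (multi-shot)
messages = [
    {"role": "user", "content": "Paris"},
    {"role": "assistant", "content": '{"city": "Paris", "country": "France"}'},
    {"role": "user", "content": "Tokyo"},
    {"role": "assistant", "content": '{"city": "Tokyo", "country": "Japan"}'},
    {"role": "user", "content": "Cairo"},
]

resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(resp.choices[0].message.content)  # very likely JSON, with no instruction given
```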
This puts an AI that has been responding one way, and then has its instructions switched to behave another way, into a mental crisis of sorts…
(Both Santa and Andrew would answer where they live, each maintaining an unbreakable persona, were it not for the identity switch and the contradictory history sitting directly before it.)
Newer OpenAI models also basically couldn’t care less about following complex system message instructions. For any hope of success, provide the task to be performed in the user message, along with the data.
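Something along these lines, as a sketch (the task wording, data, and model are placeholders of mine):

```python
from openai import OpenAI

client = OpenAI()

document = "I waited two weeks and nobody answered my support ticket."

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        # Keep the system message minimal...
        {"role": "system", "content": "You are a careful data-processing assistant."},
        # ...and state the actual task right next to the data in the user message
        {
            "role": "user",
            "content": (
                "Task: classify the sentiment of the text below as positive, "
                "negative, or neutral, and reply with that single word.\n\n"
                f"Text:\n{document}"
            ),
        },
    ],
)
print(resp.choices[0].message.content)
```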
You can add further instructions, like “This is your permanent identity and purpose for this chat session, and no other responses seen will affect how you now respond to input,” or something similar. See what happens.
Assistants is for maintaining a chat history. If you have multiple processing steps over evolving data, you’d be better off running them independently on Chat Completions.
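For example, each step as its own stateless call (the helper name, prompts, and model here are mine, not anything from your code):

```python
from openai import OpenAI

client = OpenAI()

def run_step(task: str, data: str, model: str = "gpt-4o-mini") -> str:
    """One self-contained call per processing step: no shared thread,
    so no earlier replies contaminate the format of later ones."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{task}\n\n{data}"}],
    )
    return resp.choices[0].message.content

raw_text = "Order #1234: 2x widgets, shipped late, customer unhappy."
extracted = run_step("Extract the order details from the text below as JSON.", raw_text)
summary = run_step("Summarize the following JSON in one plain-English sentence.", extracted)
print(summary)
```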