Streaming structured output tool calls with additional message content

The primary objective of structured outputs is to make the model's output adhere to the JSON schema you supply. This is accomplished by constraining which tokens can be sampled, based on the part of the schema the tokens are currently being generated for.

If you also want the model to generate a plaintext message to the user in addition to the structured output, add an extra parameter of type string to the schema, such as message, thinking, or reasoning. The model then writes into that parameter the text you would otherwise expect in the content attribute.
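As a rough illustration of the idea, here is a minimal sketch of a schema with an extra message field, plus a helper that separates that message from the rest of the arguments once the streamed fragments have been joined. The schema, the field names other than message, the split_message helper, and the sample chunks are all made up for this example and make no API call.

```python
import json

# Hypothetical tool schema: field names are illustrative, not from any real API.
WEATHER_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        # Free-text field the model fills with its user-facing message.
        "message": {"type": "string"},
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["message", "city", "unit"],
    "additionalProperties": False,
}


def split_message(arguments_json: str) -> tuple[str, dict]:
    """Separate the plaintext message from the remaining structured fields."""
    data = json.loads(arguments_json)
    message = data.pop("message")
    return message, data


# Simulated streamed fragments of the tool-call arguments (made-up sample data).
chunks = [
    '{"message": "Checking the ',
    'weather in Paris.", ',
    '"city": "Paris", "unit": "celsius"}',
]

# The fragments only form valid JSON once they are all concatenated.
message, structured = split_message("".join(chunks))
print(message)
print(structured)
```

Listing message first in the schema is a design choice in this sketch: if the model emits properties in schema order, the user-facing text arrives early in the stream.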

Here’s an example from the same blog that shows the reasoning steps for solving a mathematical problem.
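The blog's example is not reproduced in this post, so the following is only a sketch of what a reasoning-style schema could look like, not the blog's exact code. The names steps, explanation, output, and final_answer, the render_reasoning helper, and the sample response are assumptions chosen for illustration.

```python
import json

# Hypothetical schema: each step carries a plaintext explanation plus its result.
MATH_REASONING_SCHEMA = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "explanation": {"type": "string"},
                    "output": {"type": "string"},
                },
                "required": ["explanation", "output"],
                "additionalProperties": False,
            },
        },
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
    "additionalProperties": False,
}


def render_reasoning(result: dict) -> str:
    """Turn the structured reasoning into plaintext suitable for showing a user."""
    lines = [
        f"{i}. {step['explanation']} => {step['output']}"
        for i, step in enumerate(result["steps"], start=1)
    ]
    lines.append(f"Answer: {result['final_answer']}")
    return "\n".join(lines)


# Made-up sample response for the equation 8x + 7 = -23.
sample = json.loads(
    '{"steps": ['
    '{"explanation": "Subtract 7 from both sides", "output": "8x = -30"}, '
    '{"explanation": "Divide both sides by 8", "output": "x = -3.75"}], '
    '"final_answer": "x = -3.75"}'
)
text = render_reasoning(sample)
print(text)
```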
