Best way to get both conversational and json responses

Hi,

I am trying to understand how to develop something similar to this using the agents SDK.
Let’s imagine we have a system that allows the user to create html.
We want this system to keep replying to the user in a conversational format but also output the html for whatever the user described.

How should i go about doing this? I already have a prototype working but trying to figure out if there’s a better way.

Thanks,
Manuel

I don’t know how the SDK does it but my instinct would be to have the model respond entirely in JSON and include a parameter for the conversational text. Or have it output all the text and then run a function to submit its JSON data.

Consider outputting Markdown (only) and then using code to output HTML. The model is very good at markdown - HTML is much trickier but very easy to generate

1 Like

We would like to have the normal returns from the agent, including all the tools it called and it’s results and any messages it created for the user as the conversational aspect of it. We filter out the tools calls so we don’t show those to the user, just the messages the model generated that were meant for the user.

From the data the model gathers, it should then also create a json output so we can display it separately. a bit like what stitch from google does.
And no, i don’t want to create a competitor to that, just a very good example of it.

You can have the final product be an unavoidable structured output, with two keys html_page and response_to_user.

The AI will find ways to compress the quality if approaching its “how much I want to write” limits. So you still may not want to have a two-in-one output.

You might re-prompt after having the HTML code as a non-conversational deliverable, “produce a summary of the HTML and code techniques that you used to create this solution, and why it meets the needs.”

The technique depends on if this is an inescapable non-chatbot AI job.