How to include content and tool calls in one message?

It used to be easier to have content produced before a tool call. The models have been trained to do this less.

You can use the knowledge that the AI must produce the normal user response only first, and then emit the tool call in the same response, as a means to give the exact procedural steps of producing a response.

Any system instructions given to gpt-4o seem to fade quickly with extended chat.

If you simply need a high-quality progress that cannot be degraded, upon tool call, you might have a separate language model call that is given the user question and the AI function output. It could be instructed to write in the style of examples like “I am going to use my internal knowledge base search skills to help answer about giraffes”. This could even be initiated after a limited amount of streaming to get the tool, even placing the tool description for the progress AI’s understanding of what’s going on.

When you consider that writing language before emitting tool output is effectively delaying the production of that tool call, this seems like a way to get the completed response done faster while still giving user satisfaction.

2 Likes