Only use available tools, remove content

I use the tools / function calling API as part of a process to decide which data to inject. There are 4 different tool / function options. Sometimes the AI will not use a tool, which is OK, but it then generates content, which can take up to 4x longer to complete than if it's just a tool call.

How can I tell it to not bother with content (I want to move on rather than hang for the time for the content to complete)? If I can’t stop this, is there a way to do it with streaming so I can detect there is no tool call before the content generates?


Interesting!

Did you try asking for this explicitly in the prompt (system message)?

If that doesn’t work for some reason, maybe you can try moving the whole tool call into a less structured system message (describing the structure of your function calls there and asking to receive only JSON back). This is potentially a bit of a hassle, but I’d expect it to meet your requirement.

The AI will almost never write content text before emitting its tool call. Therefore, you can monitor the first chunks of data in the streamed JSON. If tool_call is still empty after several chunks, and a tool call is needed for the input to be satisfied, you can force the connection closed (client.close() in Python) and try a different technique.
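A minimal sketch of that early check, assuming the chunks look like Chat Completions streaming deltas (choices[0].delta with optional content and tool_calls) and are handled here as plain dicts. The name first_tool_name and the max_content_chunks threshold are made up for illustration and are not part of any library.

```python
from typing import Any, Dict, Iterable, Optional


def first_tool_name(
    chunks: Iterable[Dict[str, Any]],
    max_content_chunks: int = 3,
) -> Optional[str]:
    """Return the tool name as soon as one shows up in the stream.

    Returns None once `max_content_chunks` chunks with plain content text
    have been seen, or when the stream ends without any tool call.
    The chunk layout is an assumption (Chat Completions style deltas).
    """
    content_seen = 0
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # skip chunks that carry no choices
        delta = choices[0].get("delta") or {}

        # A tool call announces its function name in an early delta.
        for call in delta.get("tool_calls") or []:
            name = (call.get("function") or {}).get("name")
            if name:
                return name

        # Non-empty content means the model is writing text instead.
        if delta.get("content"):
            content_seen += 1
            if content_seen >= max_content_chunks:
                return None  # stop reading; the caller can close the stream
    return None
```

With a real streaming response you would feed the chunks in as they arrive and close the connection as soon as the function returns None, instead of waiting for the content to finish.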

If tool use is optional and depends on the user input, deciding automatically when to close the connection at a particular stage of a multi-turn pipeline would be harder.

Thanks for the responses, @ramn7 and @_j. Reporting my experience back in case someone else has similar goals:

  • Putting it in the system message doesn’t help much, but appending an instruction to the end of the user message can help limit the content. The one I used, which greatly speeds things up: “Just write ‘1’ please, i just wanted to ask the question.”

  • Streaming is also promising because, like _j said, the tool call comes first; in fact the whole tool name is there in the very first chunk. So I thought this would be the fastest solution, but when I measured it against the above it was about the same speed, which surprised me. Since this method is a bit more complex, I went with the first method.
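For anyone who wants to try the first method, here is a rough sketch of appending the instruction to the last user message before sending the request. It assumes the usual list of role/content message dicts with plain string content; the helper name with_short_reply_suffix is hypothetical.

```python
from typing import Dict, List

# The exact instruction quoted above.
SHORT_REPLY_SUFFIX = "Just write '1' please, i just wanted to ask the question."


def with_short_reply_suffix(
    messages: List[Dict[str, str]],
    suffix: str = SHORT_REPLY_SUFFIX,
) -> List[Dict[str, str]]:
    """Return a copy of `messages` with `suffix` appended to the last user message.

    The input list is left untouched. Assumes each message is a dict with
    string "role" and "content" values.
    """
    result = [dict(m) for m in messages]  # shallow-copy each message
    for message in reversed(result):
        if message.get("role") == "user":
            message["content"] = f"{message['content'].rstrip()}\n\n{suffix}"
            break  # only the most recent user message is changed
    return result
```

If the model skips the tool call, the reply should then be roughly a single token rather than a full answer, so the request returns quickly either way.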
