tool_choice=auto sending both content and tool_calls

I am calling the API using tool_choice=auto, but I am getting a response back with both “content” and “tool_calls”. According to the docs the auto option “means the model can pick between generating a message or calling one or more tools.” I am expecting it to do one or the other, not both at the same time.


You have the answer in the summary below, and more details at the link.


Tool choice

By default the model will determine when and how many tools to use. You can force specific behavior with the tool_choice parameter.

  1. Auto: (Default) Call zero, one, or multiple functions. tool_choice: "auto"

  2. Required: Call one or more functions. tool_choice: "required"

  3. Forced Function: Call exactly one specific function. tool_choice: {"type": "function", "name": "get_weather"}

https://platform.openai.com/docs/guides/function-calling?api-mode=responses#additional-configurations
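For concreteness, the forced-function case might look like this as a request body (a sketch only: the model name, question, and tool definition are placeholders, and the tool shape follows the Responses-API style used in the linked guide):

```python
# Sketch of a Responses-API request body for the forced-function case.
# The model name and tool definition are placeholders.
request = {
    "model": "gpt-4.1",
    "input": "What's the weather in Paris?",
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    # Forces exactly this one function to be called.
    "tool_choice": {"type": "function", "name": "get_weather"},
}
```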

That’s not really answering the question, though. I get that auto allows it to call functions (or not), but what I am not expecting is for it to actually generate content along with the tool_calls. The docs seem to imply that it’ll do one or the other. Thank you for taking the time to help me, though. I appreciate it.


The AI model can indeed write to your user and then also emit a call to a tool in the same response. It typically takes a tool description requiring that pattern, but it is certainly permitted behavior and should be handled.
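A client therefore shouldn't branch on "either content or tool_calls". A minimal sketch of handling both parts of one assistant message, using the Chat Completions message shape (the handler name and sample values are made up):

```python
import json

def handle_assistant_message(message):
    """Extract both the user-facing text and any tool calls from one
    assistant message; either part may be missing."""
    text = message.get("content")            # prose for the user, or None
    calls = []
    for call in message.get("tool_calls") or []:
        calls.append((
            call["id"],
            call["function"]["name"],
            json.loads(call["function"]["arguments"]),  # arguments arrive as a JSON string
        ))
    return text, calls

# A response carrying BOTH content and a tool call, as described above:
msg = {
    "role": "assistant",
    "content": "Let me check the weather for you.",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}
text, calls = handle_assistant_message(msg)
```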


So, we believe this is indeed intended behavior? The docs could maybe be a bit more clear here about the effect of the “auto” setting. The wording seems to indicate that it’ll either call tools or generate content, but not both.

Yep. “auto” is also the default, meaning it doesn’t change behavior. Forcing a tool call isn’t practical unless you have specific managed turns that aren’t typical user “chat”.

Others request that behavior, though, so I show how to do it:


I tried turning on “required” with Spring AI and that caused an infinite loop, because it never got to a “stop” message (it always had to call a tool). Ouch! Spring handles all of the back and forth automatically with tool calls and only returns to the caller the final “stop” message. Thank you very much for taking the time to help me!


Yep - that’s also the logical behavior. Your code should know when it is sending back a tool response, and that’s the point where you relax the API parameter so it stops forcing more invocations - unless you do want to force more, like making the AI produce multiple web searches or such.
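That per-turn decision can be sketched in a few lines (a Python illustration, not Spring AI API; the helper name and message shapes are assumptions): force a tool on the first turn, then relax to auto once a tool result is being sent back, which avoids the infinite loop described above.

```python
def next_tool_choice(messages):
    """Pick the tool_choice for the next request: force a call on the
    first model turn, then relax once a tool result is going back,
    so the model is allowed to emit a final 'stop' answer."""
    last = messages[-1] if messages else None
    if last is not None and last.get("role") == "tool":
        return "auto"      # tool output is in the history; let the model finish
    return "required"      # otherwise insist the model picks a tool
```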

Time to roll up my sleeves and stop using all the “magic”!


Does the generated content contain the correct data from those toolCalls? And which endpoint are you calling where you get content and toolCalls at the same time?

There can be, for example:

  • Request (prompt)
    • ToolCall (action_required)
      • getLocation
      • getWeather
    • Completed (fetch)

Or

  • Request
    • ToolCall
      • getDate
      • getLocation
    • ToolCall
      • getWeather
    • Completed

It depends on the tools available and the instructions, where you can optionally define workflows, how tools can or should be called. I have managed to let assistants execute 60 tools in a single request (doing practically the same thing, but 60 times nonetheless).
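The driver loop behind those flows can be sketched like this (a Python sketch: `call_model` stands in for the actual API request, and the message shapes are simplified for brevity):

```python
def run_until_complete(call_model, execute_tool, messages, max_rounds=10):
    """Keep exchanging tool calls for tool results until the model
    returns a message with no tool calls (the 'Completed' step)."""
    for _ in range(max_rounds):
        reply = call_model(messages)
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:
            return reply                      # plain assistant message: done
        for call in calls:                    # may be several calls per round
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": execute_tool(call["name"]),
            })
    raise RuntimeError("model kept calling tools; giving up")

# Scripted stand-in for the model, replaying the second flow above:
rounds = iter([
    {"role": "assistant", "tool_calls": [{"id": "c1", "name": "getDate"},
                                         {"id": "c2", "name": "getLocation"}]},
    {"role": "assistant", "tool_calls": [{"id": "c3", "name": "getWeather"}]},
    {"role": "assistant", "content": "It is sunny today."},
])
final = run_until_complete(lambda msgs: next(rounds),
                           lambda name: f"{name} result",
                           [{"role": "user", "content": "Weather?"}])
```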

Ouch. It’s better to have AI-powered tools than a chatbot that reuses the full conversational context over and over, growing and paying for the chat history each time.

You’ve got a weather getter that needs multiple sources?

tool call → (function loads programmatic location, programmatic location search to geolocated results, AI chooses the likely intention, programmatic API call to weather, programmatic rewriting to natural language) → tool return.
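That pipeline might look like this inside the tool body (all helper names below are invented stand-ins for a geolocation lookup, a geocoding API, a ranking step, a weather API, and a rewriting step):

```python
# Hypothetical helpers; a real tool would call actual services here.
def detect_location():          return "Springfield"
def geocode(query):             return [{"name": query, "lat": 39.8, "lon": -89.6}]
def pick_likely_intent(cands):  return cands[0]
def fetch_weather(place):       return {"temp_c": 21, "sky": "clear"}
def describe(raw):              return f"Clear skies, {raw['temp_c']} °C."

def get_weather_tool(_args):
    """One tool call in, one natural-language tool return out; every
    intermediate step is ordinary code, invisible to the chat model."""
    place = pick_likely_intent(geocode(detect_location()))
    return describe(fetch_weather(place))
```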

I created a very rudimentary example (github org = callibrity and repo = tools-demo) for demo purposes:

    @Tool(description = "Add items to grocery list")
    public String addItemsToGroceryList(List<String> items) {
        log.info("Add items to grocery list: {}", items);
        return "success";
    }

    @Tool(description = "List all dietary preferences")
    public String listAllDietaryPreferences() {
        return "I am on the mediterranean diet.";
    }

    @Tool(description = "List all available ingredients in the kitchen")
    public String listAvailableIngredients() {
        return "We have tomatoes, onions, garlic, spinach, chicken breast, and rice available.";
    }

    @Tool(description = "List all equipment available in the kitchen")
    public String listEquipmentAvailable() {
        return "We have a blender, microwave, oven, and a coffee maker available.";
    }

    @Tool(description = "List all food allergies")
    public String listFoodAllergies() {
        return "I am allergic to soy and shellfish.";
    }

The responses are incorporating the information I’ve provided in these tools correctly. It’s all pretty darn slick!

Yeah, sure, but I wanted to see what’s possible and how far I can push it… and it created 60 different logs, simulating failed interactions or information gaps…

Not really; that was an example of what to expect when using toolCalls… I use PHP with no SDK, so I have to handle everything on my own. Go on, tell me why that’s also wrong, can’t wait to read :wink:

Ok, so it’s not an issue but a feature. You could log that data for analysis purposes, for example. I use the API via PHP and the regular endpoints, where everything arrives scattered in chunks over many requests.

Then the tricky part is deciding whether it is just cheaper to always place that information in the AI context. You’ve got a 10,000-token conversation? Well, a tool call means you pay for that as input twice.

Like, “I want a tool to get the time”. Dude! The function specification sent to the AI will use more tokens than the time itself always being in a system message would.
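Back-of-the-envelope, with made-up token counts, the "pay twice" effect looks like this:

```python
# Made-up token counts for illustration only.
history_tokens = 10_000      # existing conversation, sent as input every time
tool_spec_tokens = 300       # function definitions, also sent every time
tool_result_tokens = 50      # the tool return appended before the second run

# Run 1: model reads history + specs and emits the function call.
run1_input = history_tokens + tool_spec_tokens
# Run 2: same history + specs again, plus the tool result, to write the answer.
run2_input = history_tokens + tool_spec_tokens + tool_result_tokens

total_input = run1_input + run2_input   # the 10k history is billed twice
```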

Would it be twice or three times? i.e.

  • Base Model takes full input and chooses to use tool
  • Tool Model takes full input to work out response - Does this always happen? It seems to happen with the search tool but that may be a special case?
  • Base Model takes full input plus tool response and continues

Maybe they eat the cost of one of these since it can be reliably ‘query’ cached.

There’s no “tool” model with functions. You have to run the API twice:

  1. assistant - I call your function
  2. assistant - I produce user answer from the added tool return.

That is, unless the output continues with more function calls, in which case you repeat step 1.
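As a sketch, those two runs end up looking like this in the accumulated message list (Chat Completions shapes; all values are illustrative):

```python
# The two assistant turns of one function-calling exchange, as they
# accumulate in the message list (Chat Completions shapes):
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # Run 1 output: "assistant - I call your function"
    {"role": "assistant", "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }]},
    # Your code runs get_weather and appends the tool return.
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 21}'},
    # Run 2 output: "assistant - I produce user answer from the added tool return"
    {"role": "assistant", "content": "It's 21 °C in Paris right now."},
]
```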