How can I use function calling with response format (structured output feature) for final response?

Hi everyone,

I’m developing a chatbot using GPT-4o and integrating tools to fetch external data for generating user responses.

With the introduction of the response_format feature, I’d like to produce responses in a specific JSON schema after making tool calls, rather than defaulting to plain text. However, the documentation doesn’t cover scenarios where both tool calls and the response_format parameter are used simultaneously.
https://platform.openai.com/docs/guides/structured-outputs/introduction

Is it feasible to combine tool calling with the response_format parameter?

Currently, my workaround involves creating an additional tool (response_format_tool) dedicated to formatting the response. I instruct the model to call this tool right before generating the final output. Once the model invokes response_format_tool, I use its function arguments as the final response and halt any further LLM calls.

Please let me know if you need any clarification.

Thank you.

Bump!

How is this not an issue for more people? Sending a prompt with tools attached works fine, and the tools are called properly. The problem arises once we send back the tool call results: a 500 server error is returned.

I read the documentation and must have missed that these options conflict. It doesn’t make sense to me, but I won’t bother arguing that. I’m just looking for a solution.

What I did attempt was to remove the “tools” array from the request parameters. I also replaced each message with role “tool” with an assistant message and removed its “tool_call_id”, so the chat history is not left in a broken state.

I’m not sure whether the “parallel_tool_calls” setting affected anything, but I had to remove it when sending the results back for processing, since the setting can only exist in the request if tools are given.
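A minimal sketch of the history rewrite described above, for anyone who wants to try the same thing. The helper name `flatten_tool_history` and the "Tool result:" prefix are made up for illustration. Dropping assistant turns that only contain `tool_calls` is an assumption on top of what the post describes, made so that no tool call is left in the history without a matching result.

```python
def flatten_tool_history(messages):
    """Return a copy of the chat history with tool-specific fields removed.

    The result can be sent without the `tools` and `parallel_tool_calls`
    parameters, e.g. together with `response_format`.
    """
    flattened = []
    for msg in messages:
        role = msg.get("role")
        if role == "assistant" and msg.get("tool_calls"):
            # Assumption: a turn that only requested tools is dropped;
            # any text it carried is kept as a plain assistant message.
            if msg.get("content"):
                flattened.append({"role": "assistant", "content": msg["content"]})
            continue
        if role == "tool":
            # Re-label the tool result as assistant text; tool_call_id is omitted.
            flattened.append(
                {"role": "assistant", "content": f"Tool result: {msg['content']}"}
            )
            continue
        flattened.append(dict(msg))
    return flattened
```

The input list is not modified, so the original history with the tool calls can still be kept for logging.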

How about you use the function itself to create a deterministic output for the user?

i.e. send the structured JSON directly to the user UI instead of sending the result to the LLM and expecting the LLM to retain the same output.

You can also add this to memory so it is included in subsequent history included in the prompt.

I believe that might solve the issue, but it would require you to manage your own memory, so that might rule out the Assistants API and force you to use Chat Completions.

It would also require you to send something sensible to the LLM so it responds with something that makes sense alongside the structured output provided by the function call.

This would also avoid potential hallucinations during the LLM round trip.

I want each assistant message to follow the schema, so it can reliably track and refine data in the format I give it. Are these two options (response_format: json_schema & tools) mutually exclusive?

pastebin: T121a6pR

I am confused about the terminology, are functions the same as tools?

What I am trying to achieve is a stateful conversation where the model refines its response based on previous messages in an iterative process; I expect the last response to be the final, finished version. The task is to perform research on a certain topic and collect relevant information that can be used to generate a blog article in another part of my application. The structured response is simply used to guide its progress and keep it on track, while allowing me to intervene programmatically.

Take a look at Pastebin, I have my console output there. As a black sheep here I avoid using Python.

yes

This sounds a lot like Canvas btw, so something along these lines must be possible somehow.

In fact I asked ChatGPT (with Canvas) to create a JSON with specific properties with example values and it did, then I asked it to remove some of the examples… and it did. Try it :)

I’m adding an example with code to explain the scenario:

This is what I want to do:

Let’s say the function get_order_details returns the order details for a given order_id. We can attach this tool/function to the LLM, and the LLM can then answer questions related to order details.


# Define the get_order_details function
def get_order_details(order_id):
    # Mock implementation for the example
    return {
        "order_id": order_id,
        "status": "shipped",
        "items": [
            {"name": "Laptop", "quantity": 1, "price": 1000},
            {"name": "Mouse", "quantity": 2, "price": 20}
        ],
        "shipping_date": "2024-10-01"
    }

We can pass this function as a tool to openai client like this:

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools, # tool for order details function (JSON)
)
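The contents of `tools` and `messages` are not shown in the post. A minimal sketch of what they might look like for this example, using the Chat Completions function-tool format; the description strings are placeholders.

```python
# Tool definition for get_order_details in the Chat Completions format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_details",
            "description": "Look up the details of an order by its ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The ID of the order to look up.",
                    },
                },
                "required": ["order_id"],
                "additionalProperties": False,
            },
        },
    }
]

# Conversation so far: a single user question.
messages = [
    {"role": "user", "content": "Can you tell me the order details for order_id: 123?"}
]
```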

Example conversation:
User: Can you tell me the order details for order_id: 123?
Tool call: get_order_details
Tool output: {…}
Assistant: Your order was shipped on 2024-10-01 …

Here, the response is in plain text, and I want to get the response in a fixed JSON schema (OrderDetailsResponse) as shown below.

Assistant: {order details in JSON string}

# Define the response format using Pydantic
from pydantic import BaseModel

class OrderDetailsResponse(BaseModel):
    order_id: str
    status: str
    items: list[dict]
    shipping_date: str

With the structured output feature, we can provide the above schema class via the response_format parameter to get the final answer in that JSON schema instead of plain text.

Here is an example from the documentation:

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent, # here
)

event = completion.choices[0].message.parsed

But this example does not use any tools, and I’m not able to find any example that uses tools and response_format simultaneously.

For now, my workaround is using OrderDetailsResponse as a tool along with get_order_details and instructing the model (system prompt) to call this OrderDetailsResponse before generating the plain text output.

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools, # tool for order details function and OrderDetailsResponse
)

Once the LLM calls this OrderDetailsResponse tool, I have the arguments in the JSON schema.

But this approach is somewhat unreliable: the LLM needs to call OrderDetailsResponse as its last step, otherwise I will not get the response in the JSON schema.
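A minimal sketch of the "stop once the formatting tool is called" check in this workaround. The helper name `extract_final_response` is made up for illustration; the attribute access (`message.tool_calls`, `call.function.name`, `call.function.arguments`) follows the message object returned by the openai Python SDK, and `SimpleNamespace` stands in for that object so the sketch runs without an API call.

```python
import json
from types import SimpleNamespace

FINAL_TOOL_NAME = "OrderDetailsResponse"

def extract_final_response(message):
    """Return the parsed arguments of the OrderDetailsResponse call, or None."""
    for call in (getattr(message, "tool_calls", None) or []):
        if call.function.name == FINAL_TOOL_NAME:
            # Tool arguments arrive as a JSON string.
            return json.loads(call.function.arguments)
    return None

# Stand-in for response.choices[0].message
message = SimpleNamespace(
    content=None,
    tool_calls=[
        SimpleNamespace(
            id="call_1",
            function=SimpleNamespace(
                name="OrderDetailsResponse",
                arguments='{"order_id": "123", "status": "shipped"}',
            ),
        )
    ],
)
final = extract_final_response(message)
if final is not None:
    print(final)  # use as the final response and stop calling the LLM
```

Chat Completions also accepts `tool_choice="required"`, which forces the model to call some tool on every turn. With that setting the model cannot end in plain text, which might make this workaround more reliable, though that is untested here.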

Please let me know if you need any clarification.

Thank you.

Once the LLM has called a tool, the tool schema no longer needs to be included in the follow-up request payload. Replace it with the response_format. The sequence looks something like this:
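A minimal sketch of that sequence as two request payloads, assuming the API accepts tool messages in the history when `tools` is omitted from the follow-up request. The function names and `ORDER_SCHEMA` are made up for illustration; each returned dict would be passed as `client.chat.completions.create(**request)`.

```python
import json

# Hypothetical JSON Schema for the final answer (mirrors the order example).
ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "status": {"type": "string"},
        "shipping_date": {"type": "string"},
    },
    "required": ["order_id", "status", "shipping_date"],
    "additionalProperties": False,
}

def build_tool_request(messages, tools):
    """Step 1: send the tools and let the model decide which one to call."""
    return {"model": "gpt-4o-2024-08-06", "messages": messages, "tools": tools}

def build_final_request(messages, assistant_message, tool_outputs):
    """Step 2: send the tool results back with response_format instead of tools.

    `tool_outputs` maps each tool call id to the result of running that tool.
    """
    followup = list(messages) + [assistant_message]
    for call in assistant_message["tool_calls"]:
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(tool_outputs[call["id"]]),
        })
    return {
        "model": "gpt-4o-2024-08-06",
        "messages": followup,
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "order_details_response",
                "strict": True,
                "schema": ORDER_SCHEMA,
            },
        },
    }
```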

@nicholishen There could be more than one tool call, so we don’t know when to omit the tool parameter.

We could do something like this:
If there is no tool call in the LLM response, call the LLM again with the response_format parameter and omit the tools parameter.
This would cost extra, since it adds one more LLM call compared to my original workaround mentioned in my last message.
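A minimal sketch of that loop, to make the extra call explicit. `call_model` and `run_tool` are hypothetical callables, not part of any library: `call_model(**kwargs)` is assumed to wrap `client.chat.completions.create` and return the assistant message as a plain dict, and `run_tool(name, arguments)` is assumed to dispatch to the matching local function.

```python
import json

def run_until_structured(call_model, run_tool, messages, tools,
                         response_format, max_rounds=10):
    """Run tool calls until the model stops asking for tools, then make one
    extra call (no tools, response_format set) for the structured answer."""
    messages = list(messages)
    for _ in range(max_rounds):
        reply = call_model(messages=messages, tools=tools)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            break  # plain-text reply: the model is done with tools
        messages.append(reply)
        for call in tool_calls:
            args = json.loads(call["function"]["arguments"])
            result = run_tool(call["function"]["name"], args)
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    else:
        raise RuntimeError("model kept calling tools; giving up")

    # The extra call: same history, tools omitted, schema enforced.
    # The plain-text reply from the loop is discarded.
    final = call_model(messages=messages, response_format=response_format)
    return json.loads(final["content"])
```

The `max_rounds` cap is only a safety net against a model that never stops calling tools; the value 10 is arbitrary.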

This is exactly the same issue we are facing. Since we are using ReAct-like agents and agents of agents, we don’t always know when the tools are done being called. It would be nice if there were a way to combine, as the OP mentioned, tool calling (which ignores the response format) and response format (for when the model is done using tools).

Tool calling is done when the AI responds only with “content” and no “tool_calls” are sent to you.

You can use both structured tools function specification and a structured enforced output response_format at the same time - on Chat Completions.
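A minimal sketch of what that could look like in practice: one request carrying both parameters, plus a check on each reply that follows the rule above (tool calls present means keep going, content only means final answer). The function names and the `final_answer` schema name are made up for illustration, and the sketch assumes the assistant message is available as a plain dict.

```python
import json

def build_combined_request(messages, tools, schema):
    """Build one Chat Completions payload with both tools and response_format."""
    return {
        "model": "gpt-4o-2024-08-06",
        "messages": messages,
        "tools": tools,
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "final_answer", "strict": True, "schema": schema},
        },
    }

def interpret_reply(message):
    """Classify an assistant message as a tool request or the final answer."""
    if message.get("tool_calls"):
        # Still using tools: run them and send the results back.
        return "tool_calls", message["tool_calls"]
    # Content only: this should be the JSON that follows the schema.
    return "final", json.loads(message["content"])
```

With this shape, the same payload can be reused on every turn, so there is no need to know in advance which call will be the last one.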