Tool Calls placed inside content

I’m using tool calls together with the JSON output format, and sometimes I get the tool-call response as JSON inside the content field.

Here is an example of what I mean:

[
  {
    "message": {
      "role": "assistant",
      "content": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_83MaAXFEu1LWTWeTqCLfV7PQ",
            "type": "function",
            "function": {
              "name": "google-search",
              "arguments": { "query": "site:volteo.com OpenShift" }
            }
          }
        ]
      }
    },
    "finish_reason": "stop",
    "index": 0
  }
]

It should be message > tool_calls; not message > content > tool_calls.
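
To make the difference concrete, the tool calls could be dug out of either shape with something like this (a sketch only; extract_tool_calls is a hypothetical helper of mine, not part of any SDK):

import json

def extract_tool_calls(message: dict):
    # Documented shape: tool_calls sits directly on the message, content is null
    if message.get("tool_calls"):
        return message["tool_calls"]

    # Observed shape: a whole assistant message object nested inside content
    content = message.get("content")
    if isinstance(content, str):
        try:
            # in my case the nested object arrived as a JSON string I had to parse
            content = json.loads(content)
        except ValueError:
            return []
    if isinstance(content, dict):
        return content.get("tool_calls") or []
    return []

# e.g. calls = extract_tool_calls(response_json[0]["message"])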

Hey there!

So, correct me if I’m wrong, but when I check the API reference, your output looks like the output it’s supposed to produce?

example query:

from openai import OpenAI
client = OpenAI()

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
          },
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
      },
    }
  }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]
completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=messages,
  tools=tools,
  tool_choice="auto"
)

print(completion)

example response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699896916,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\n\"location\": \"Boston, MA\"\n}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 17,
    "total_tokens": 99
  }
}

from https://platform.openai.com/docs/api-reference/chat/create
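
For reference, reading that documented shape back with the Python SDK would look roughly like this (just a sketch; completion is the object returned by the call above):

import json

message = completion.choices[0].message

if message.tool_calls:  # finish_reason is "tool_calls" in the documented case
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    print(call.function.name, args)             # get_current_weather {'location': 'Boston, MA'}
else:
    print(message.content)  # otherwise content is a plain string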

I would say you got a response that should never happen, and likely something else has re-parsed it along the way. The assistant content should only ever be a string (or fail validation), and it should be impossible for the AI to output other object types there, except through the special channel for sending to a tool.

If you can log the unaltered bytes returned from the API, that would make a convincing case that OpenAI has written themselves a big parsing bug.

I JSON-parsed the string and formatted it for better readability.
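
Something like this, with raw standing in for the string that came back:

import json

# raw stands in for the string I received; this just pretty-prints it
raw = '[{"message": {"role": "assistant", "content": null}, "finish_reason": "stop", "index": 0}]'
print(json.dumps(json.loads(raw), indent=2))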

I don’t think it’s a parsing bug; it only happens a small percentage of the time. What I was thinking is that the model sometimes randomly spits out the incorrect JSON.

Not quite. In my example, I don’t have tool calls at the top level of the output; instead, the tool calls are nested within content.

The model doesn’t actually write the JSON framework of an API response or tool call. It has its own language for communicating with the API about tool recipients, one that you don’t see.

The nesting you’re showing here:

"role": "assistant",
      "content": {
        "role": "assistant",
        "content": null,

This is showing some sort of parse error in what came out of the AI model. The API should never return a dictionary to us as the assistant message, regardless of what the AI writes. That role: assistant appearing again could mean that it is OpenAI doing it, although that would result in more widespread reports of failure.

However, one would want to find out the exact bytes received over the HTTPS request to the API: what exactly was created and sent to the developer, not whatever comes out of the object loading that happens afterwards. Log that alongside the further parsing you do.

You can adapt your code to use with_raw_response, which gives you direct access to an httpx response object as the return value.
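
Something along these lines with the Python SDK (a rough sketch; the attribute names on the raw wrapper may differ between SDK versions):

from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

# with_raw_response wraps the same endpoint but also hands back the
# transport-level response instead of only the parsed ChatCompletion
raw = client.chat.completions.with_raw_response.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the weather like in Boston today?"}],
    tools=tools,
    tool_choice="auto",
)

# the body exactly as it came over the wire, before any object loading
print(raw.http_response.text)

# then parse into the usual typed object for normal handling
completion = raw.parse()
print(completion.choices[0].message)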

OK, will try. I’m using the official Node wrapper, not the Python version though, and that doesn’t have a with_raw_response AFAIK.

Do you know if there is a way to use the prompt + fingerprint (or something else) to recreate the API call (in either Python or raw HTTPS)?

The “seed” parameter: give it some number, then repeat that same number on another request, and the token selection should be similar on an otherwise identical run. Still not 100% the same, because of the inference math that comes before sampling.

The fingerprint changing after a model update just tells you why the AI isn’t reproducing the same thing any more.
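
A sketch of what a seed-based rerun could look like (reproducibility is best-effort; the fingerprint comparison only tells you whether the backend changed underneath you):

from openai import OpenAI

client = OpenAI()

def run(seed: int):
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "What's the weather like in Boston today?"}],
        seed=seed,  # repeat the same value to make token sampling (mostly) repeatable
    )
    # system_fingerprint identifies the backend configuration; if it changes
    # between runs, that explains why outputs stop matching
    return completion.system_fingerprint, completion.choices[0].message.content

fp1, out1 = run(12345)
fp2, out2 = run(12345)
print(fp1 == fp2, out1 == out2)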