GPT-4o Audio + Tool calls = API Bug

Currently, if GPT includes audio alongside its tool_calls, the API throws an error when we send back the tool result.

The problem occurs when we have an assistant message like this (tool calls + audio response):

    {
      "role": "assistant",
      "tool_calls": [
        {
          "id": "call_oHHuf6tqsySDjE6KIRSK1TvR",
          "type": "function",
          "function": {
            "name": "create_poll",
            "arguments": "{\"question\":\"Pain au chocolat ou beurre ?\",\"choices\":[\"Pain au chocolat\",\"Pain au beurre\"],\"duration\":300}"
          }
        }
      ],
      "audio": {
        "id": "audio_677a61d5d9d08190b1987ff1c1326c73"
      }
    },

The API seems to ignore the tool_calls in that message and throws an error as if we had not provided them:

Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.

The bug also occurs in the Playground.


Found the same issue. The only workaround for the moment is to literally exclude the audio from the saved message if tool_calls are present, but that is obviously suboptimal.
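
Here is a minimal sketch of that workaround, assuming the conversation history is kept as a list of plain dicts in the same shape as the assistant message above. The helper name `strip_audio_when_tool_calls` is made up for illustration and is not part of any SDK.

```python
from copy import deepcopy
from typing import Any

Message = dict[str, Any]


def strip_audio_when_tool_calls(messages: list[Message]) -> list[Message]:
    """Return a copy of the history where assistant messages that carry
    tool_calls no longer carry an `audio` reference.

    Hypothetical helper: it only rewrites the local history, it does not
    call the API.
    """
    cleaned: list[Message] = []
    for message in messages:
        message = deepcopy(message)  # leave the caller's history untouched
        if message.get("role") == "assistant" and message.get("tool_calls"):
            # Drop the audio reference so the tool_calls are not ignored.
            message.pop("audio", None)
        cleaned.append(message)
    return cleaned


history = [
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": "call_oHHuf6tqsySDjE6KIRSK1TvR",
                "type": "function",
                "function": {"name": "create_poll", "arguments": "{}"},
            }
        ],
        "audio": {"id": "audio_677a61d5d9d08190b1987ff1c1326c73"},
    },
    {
        "role": "tool",
        "tool_call_id": "call_oHHuf6tqsySDjE6KIRSK1TvR",
        "content": "ok",
    },
]

safe_history = strip_audio_when_tool_calls(history)
```

Pass `safe_history` as `messages` on the next request instead of `history`. Assistant messages without tool_calls keep their `audio` entry, so only the turns that trigger the error are affected.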