Files and Function Calling do not work together

When I call the API with both files and functions, the API seems to completely ignore the tool call and its result. Here is an example curl request:

curl -X POST "https://api.openai.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d @-  <<EOF
  {
  "model": "gpt-4.1",
  "messages": [
    {
      "role": "system",
      "content": "You are cool"
    },
    {
      "role": "user",
      "content": [
        {
          "text": "{\"task\":\"ask me for the file\",\"manuallyStarted\":true}",
          "type": "text"
        }
      ]
    },
    {
      "role": "assistant",
      "content": [],
      "tool_calls": [
        {
          "id": "call_random_id",
          "type": "function",
          "function": {
            "name": "ask",
            "arguments": "{\"message\":\"Could you please provide the file that you would like me to work with or analyze?\",\"reasoning\":\"1. GOAL: Asking for the file that needs to be processed or analyzed. 2. CONTEXT: I need the file as part of the task to proceed with the intended operations.\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": [
        {
          "text": "what is this",
          "type": "text"
        },
        {
          "text": "The file is located at the following path: \"/uploads/ticket.pdf\". You can use this path to refer to the file in tool calls.",
          "type": "text"
        }
      ],
      "tool_call_id": "call_random_id"
    },
    {
      "role": "user",
      "content": [
        {
          "type": "file",
          "file": {
            "filename": "ticket.pdf",
            "file_data": "data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iago8PC9UeXBlIC9DYXRhbG9nCi9QYWdlcyAyIDAgUgo+PgplbmRvYmoK MiAwIG9iago8PC9UeXBlIC9QYWdlcwovS2lkcyBbMyAwIFJdCi9Db3VudCAxCj4+CmVuZG9iagoz IDAgb2JqCjw8L1R5cGUgL1BhZ2UKL1BhcmVudCAyIDAgUgovTWVkaWFCb3ggWzAgMCA1OTUgODQy XQovQ29udGVudHMgNSAwIFIKL1Jlc291cmNlcyA8PC9Qcm9jU2V0IFsvUERGIC9UZXh0XQovRm9u dCA8PC9GMSA0IDAgUj4+Cj4+Cj4+CmVuZG9iago0IDAgb2JqCjw8L1R5cGUgL0ZvbnQKL1N1YnR5 cGUgL1R5cGUxCi9OYW1lIC9GMQovQmFzZUZvbnQgL0hlbHZldGljYQovRW5jb2RpbmcgL01hY1Jv bWFuRW5jb2RpbmcKPj4KZW5kb2JqCjUgMCBvYmoKPDwvTGVuZ3RoIDUzCj4+CnN0cmVhbQpCVAov RjEgMjAgVGYKMjIwIDQwMCBUZAooRHVtbXkgUERGKSBUagpFVAplbmRzdHJlYW0KZW5kb2JqCnhy ZWYKMCA2CjAwMDAwMDAwMDAgNjU1MzUgZgowMDAwMDAwMDA5IDAwMDAwIG4KMDAwMDAwMDA2MyAw MDAwMCBuCjAwMDAwMDAxMjQgMDAwMDAgbgowMDAwMDAwMjc3IDAwMDAwIG4KMDAwMDAwMDM5MiAw MDAwMCBuCnRyYWlsZXIKPDwvU2l6ZSA2Ci9Sb290IDEgMCBSCj4+CnN0YXJ0eHJlZgo0OTUKJSVF T0YK"
          }
        }
      ]
    }
  ],
  "max_tokens": 16384,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "notify",
        "parameters": {
          "type": "object",
          "required": [
            "message",
            "reasoning"
          ],
          "properties": {
            "message": {
              "type": "string",
              "description": "The message to notify the human user with. This is a one-way notification; no response is expected."
            },
            "reasoning": {
              "type": "string",
              "description": "Structured reasoning about this tool usage with this exact format:\n1. GOAL: What specific outcome are you trying to achieve? Should be concise and to the point. in the present tense. Creating, Adding, Updating, etc.\n2. CONTEXT: What relevant information or state are you working with?"
            }
          }
        },
        "description": "Send a one-way notification or informational message to a human user. No response is expected."
      }
    },
    {
      "type": "function",
      "function": {
        "name": "ask",
        "parameters": {
          "type": "object",
          "required": [
            "message",
            "reasoning"
          ],
          "properties": {
            "message": {
              "type": "string",
              "description": "The question to ask the human user. The tool will wait for a response before continuing."
            },
            "reasoning": {
              "type": "string",
              "description": "Structured reasoning about this tool usage with this exact format:\n1. GOAL: What specific outcome are you trying to achieve? Should be concise and to the point. in the present tense. Creating, Adding, Updating, etc.\n2. CONTEXT: What relevant information or state are you working with?"
            }
          }
        },
        "description": "Prompt a human user with a question and wait for their response before continuing."
      }
    },
    {
      "type": "function",
      "function": {
        "name": "done",
        "parameters": {
          "type": "object",
          "required": [
            "result"
          ],
          "properties": {
            "result": {
              "type": "object",
              "properties": {
                "json": {
                  "type": "object",
                  "description": "A field for submitting a JSON object result when signaling successful completion.\nIf the result to submit is a JSON object, include it as a JSON object, NOT as stringified and escaped JSON."
                },
                "error": {
                  "type": "string",
                  "description": "If it is impossible to create a successful result to return, you can instead use this field to report why it was impossible to continue."
                }
              },
              "description": "A JSON object containing the result when signaling successful completion"
            }
          }
        },
        "description": "Signal that you have completed your task."
      }
    }
  ],
  "tool_choice": "required"
}
EOF

The response is:

{
  "error": {
    "message": "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.",
    "type": "invalid_request_error",
    "param": "messages.[4].role",
    "code": null
  }
}

If I remove the user message that contains the file, the API returns a completion.
I have confirmed this behavior with the Files API as well. If I remove the tool call and its response, the model can understand the file. If I remove the user message with the file, the model returns a completion. So I know that the tool call/response is set up correctly.
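The claim that the tool call and its response are paired correctly can also be checked locally before sending anything. The following is a minimal sketch, not the API's actual validation logic: `unmatched_tool_messages` is a hypothetical helper, and the message list is an abbreviated copy of the payload above with the file data omitted.

```python
def unmatched_tool_messages(messages):
    """Return indices of 'tool' messages that do not answer a tool call
    issued by the closest preceding assistant message."""
    pending = set()  # tool call ids issued by the last assistant turn, not yet answered
    bad = []
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role == "assistant":
            pending = {call["id"] for call in msg.get("tool_calls") or []}
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id in pending:
                pending.discard(call_id)
            else:
                bad.append(i)
        else:
            # A system or user message closes the window for tool results.
            pending = set()
    return bad


# Abbreviated version of the request body above (file data omitted).
messages = [
    {"role": "system", "content": "You are cool"},
    {"role": "user", "content": [{"type": "text", "text": "ask me for the file"}]},
    {"role": "assistant", "content": [], "tool_calls": [
        {"id": "call_random_id", "type": "function",
         "function": {"name": "ask", "arguments": "{}"}}]},
    {"role": "tool", "tool_call_id": "call_random_id",
     "content": [{"type": "text", "text": "what is this"}]},
    {"role": "user", "content": [
        {"type": "file", "file": {"filename": "ticket.pdf"}}]},
]

print(unmatched_tool_messages(messages))  # prints []
```

By this local rule the tool message at index 3 is paired correctly, which matches the observation that the request succeeds once the file message is removed.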

Is anyone else facing this? Any suggested fixes?

You are putting a new user message in after the tool return, without allowing the AI to respond to the result returned from the function.
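To make the ordering concrete, here is a small sketch that flags user messages placed directly after a tool result. The helper name `user_turns_directly_after_tool` is made up for illustration, and only the roles from the original request are reproduced. Whether inserting an assistant turn actually resolves the error is this reply's suggestion, not something the thread confirms.

```python
def user_turns_directly_after_tool(messages):
    """Return indices of user messages that immediately follow a tool message."""
    return [
        i
        for i in range(1, len(messages))
        if messages[i].get("role") == "user"
        and messages[i - 1].get("role") == "tool"
    ]


# Role sequence of the original request.
original = [{"role": r} for r in ["system", "user", "assistant", "tool", "user"]]

# Suggested ordering: the assistant responds to the tool result first.
suggested = [{"role": r} for r in
             ["system", "user", "assistant", "tool", "assistant", "user"]]

print(user_turns_directly_after_tool(original))   # prints [4]
print(user_turns_directly_after_tool(suggested))  # prints []
```

Index 4 is the same position that the error response names in `messages.[4].role`.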

Your functions are also complete nonsense and fit no logical pattern of use. A chatbot that takes a user question has no use for an “ask a question” function, and a function is not meant for producing a final output. A function return that both asks and answers a question?

Then your useless system message instruction “you are cool” is only showing that you don’t put any effort into making a working product.

Perhaps you should explain what pattern you are even trying to produce here.


Yeah. You need to step back a bit here and consider what a function is best used for.

If you want the bot to prompt a user to do something, then give it that instruction in a system/developer prompt (I would not do this in the user message btw, but it might work, albeit with less ‘priority’). There is no need to create a function to do this.
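As a rough sketch of that idea, the request below carries the "ask for the file" instruction in the system message and defines no tools at all. The `build_request` helper and the instruction wording are assumptions made for illustration; only the model name comes from the original request.

```python
import json


def build_request(instruction, user_text, model="gpt-4.1"):
    """Build a Chat Completions request body with no tools defined."""
    return {
        "model": model,
        "messages": [
            # The prompting behavior lives in the system message, not in a function.
            {"role": "system", "content": instruction},
            {"role": "user", "content": user_text},
        ],
    }


payload = build_request(
    instruction="Before doing anything else, ask the user to upload the file you need.",
    user_text="Please process my ticket.",
)
print(json.dumps(payload, indent=2))
```

The model's plain text reply then serves as the question to the user, so no `ask` function is involved.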

Once the user has done something, you can make sure there is a function to handle it.

Your “notify” function: what is its purpose? The LLM will respond to any query in any case. A long-running task is best left to local code. Once that task has finished, you can update the UI deterministically with normal code; you do not need the bot to notify the user. If you do want the bot to speak, you can simply send another message to the bot and handle the response as normal.


Functions need well-described and distinct purposes for the LLM to call them, and they need to fulfill a need (for the LLM to solve the request).

Notification can be done without the LLM, but if you want the bot to say something, you can send it a background request to say something to the user (a request you don’t show to the user). You don’t need a function.
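One way to picture the background request is a local transcript where each entry carries a visibility flag that never leaves your application. This is only a sketch under that assumption: the `visible` key and both helper names are hypothetical and are not part of any API.

```python
def to_api_messages(transcript):
    """Strip local-only keys so every entry, hidden or not, is sent to the model."""
    return [{"role": e["role"], "content": e["content"]} for e in transcript]


def to_ui_messages(transcript):
    """Keep only the entries the end user should see."""
    return [e for e in transcript if e.get("visible", True)]


transcript = [
    {"role": "user", "content": "Start the export."},
    {"role": "assistant", "content": "The export is running."},
    # Background request added by local code once the task finished.
    {"role": "user", "content": "The export finished. Tell the user.", "visible": False},
]

print(len(to_api_messages(transcript)))  # prints 3
print(len(to_ui_messages(transcript)))   # prints 2
```

The model sees the background request and can answer it, while the UI only renders the visible entries plus the model's reply.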

Use the LLM as a language processor and parser, not for functionality you can deliver in a normal way.

Out of interest, what is the LLM going to do with this?
