Images and files as function call outputs

We’ve updated function calling to support files and images as tool call outputs. You can now call functions like generate_chart or load_image and return those files back to the model, rather than just JSON or text. :shooting_star:

https://platform.openai.com/docs/guides/function-calling

Took me a while to figure it out.

In case anyone is wondering how to pass the output, it is documented here.

Basically, you pass the output as an array of input content items instead of a single string.

For a typical function that returns JSON, the function call output looks like this:

{
    "type": "function_call_output",
    "call_id": item.call_id,
    "output": json.dumps({
        "horoscope": "horoscope returned by the function"
    }),
}

For a base64 image output, it becomes:

{
    "type": "function_call_output",
    "call_id": item.call_id,
    # "output" is a list of content items, not a JSON string
    "output": [{
        "type": "input_image",
        "detail": "low",
        "image_url": b64_png,  # the image as a base64 data URL
    }],
}
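
To make the shape above concrete, here is a minimal sketch of a helper that wraps raw PNG bytes into such an item. The helper name image_output_item, the sample call ID, and the 8-byte sample payload are hypothetical and only for illustration; the sketch assumes image_url takes a base64 data URL, as in the payload format used elsewhere in this thread.

```python
import base64


def image_output_item(call_id: str, png_bytes: bytes, detail: str = "low") -> dict:
    """Build a function_call_output item whose output is a list holding one input_image."""
    # Encode the raw bytes and wrap them in a data URL (assumed format for image_url).
    b64 = base64.b64encode(png_bytes).decode("ascii")
    data_url = f"data:image/png;base64,{b64}"
    return {
        "type": "function_call_output",
        "call_id": call_id,
        # A real list of content items, not a JSON-encoded string.
        "output": [
            {"type": "input_image", "detail": detail, "image_url": data_url},
        ],
    }


# Hypothetical call ID and placeholder bytes (just the PNG signature, not a full image).
item = image_output_item("call_123", b"\x89PNG\r\n\x1a\n")
print(type(item["output"]).__name__, item["output"][0]["type"])
```

The returned dict would be appended to the input list of the next request, next to the original function call item.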

In case someone was wondering.

We would still have to wait for OpenAI to treat Chat Completions as a low-latency, high-performance, universal, transparent endpoint before image input for tool messages appears in its API documentation, giving it feature parity now that image-returning functions are in the developer’s control.

Which could look like:

from openai import OpenAI

client = OpenAI()

params = {
  "model": "gpt-5",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "create_image",
        "description": "Generate an image from a text prompt.",
        "parameters": {
          "type": "object",
          "properties": {
            "prompt": {
              "type": "string",
              "description": "Text description of the image to generate."
            },
            "size": {
              "type": "string",
              "description": "Image size in WxH format, e.g. 1024x1024",
              "enum": ["1536x1024", "1024x1024"]
            },
            "format": {
              "type": "string",
              "description": "Output format for the image.",
              "enum": ["png", "jpeg", "webp"]
            }
          },
          "required": ["prompt"]
        }
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Create an image of a serene mountain lake at sunrise with mist, ultra-realistic, cinematic lighting."
        }
      ]
    },
    {
      "role": "assistant",
      "content": "Generating your image now.",
      "tool_calls": [
        {
          "id": "call_imgtool_1234567890",
          "type": "function",
          "function": {
            "name": "create_image",
            "arguments": "{\"prompt\":\"serene mountain lake at sunrise with thin mist, ultra-realistic, cinematic lighting, golden hour reflections, 50mm photography, high detail\",\"size\":\"1024x1024\",\"format\":\"png\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_imgtool_1234567890",
      "content": [
        {
          "type": "text",
          "text": "Image successfully generated from prompt."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "...ELIDED_BASE64_IMAGE_ELIDED...5ErkJggg=="
          }
        }
      ]
    }
  ],
  "max_completion_tokens": 20000
}

# API Call - chat completions
response = client.chat.completions.create(**params)
print(response.choices[0].message.content)

That is, if it existed in the API documentation.

It would be a straightforward path: all it needs is for images in tool messages not to be blocked by a “role detector”.

Can anyone confirm that this works in the Responses API? I couldn’t get it working. When I send base64, it is counted as a string instead of being recognized as an image.
Here is the payload I used:
[{'type': 'function_call_output', 'output': '[{"type": "input_image", "image_url": "data:image/png;base64, <base64 string used>", "detail": "low"}]', 'call_id': 'call_oI0mqpzzXT2eEDdyILm3dvuR'}]
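
Comparing this payload with the earlier example in the thread, one visible difference is that output here is a JSON-encoded string, while the earlier example passes an actual list. Whether that explains the behaviour is an assumption, not something confirmed in this thread. A minimal sketch of a check that decodes such a string back into a list before sending; the helper name normalize_output, the set of accepted item types, and the placeholder values are hypothetical:

```python
import json

# Item types assumed to be valid inside an output list (an assumption for this sketch).
CONTENT_TYPES = {"input_text", "input_image", "input_file"}


def normalize_output(item: dict) -> dict:
    """Return a copy of item; if "output" is a JSON string encoding a content list, decode it."""
    output = item.get("output")
    if not isinstance(output, str):
        return dict(item)  # already a list (or missing): leave unchanged
    try:
        decoded = json.loads(output)
    except json.JSONDecodeError:
        return dict(item)  # ordinary text output: leave unchanged
    looks_like_content = (
        isinstance(decoded, list)
        and len(decoded) > 0
        and all(isinstance(p, dict) and p.get("type") in CONTENT_TYPES for p in decoded)
    )
    return {**item, "output": decoded} if looks_like_content else dict(item)


# Same shape as the payload above, with placeholder base64 data and call ID.
sent = {
    "type": "function_call_output",
    "call_id": "call_placeholder",
    "output": '[{"type": "input_image", "image_url": "data:image/png;base64,AAAA", "detail": "low"}]',
}
fixed = normalize_output(sent)
print(type(sent["output"]).__name__, "->", type(fixed["output"]).__name__)
```

Ordinary JSON results, such as the horoscope example earlier in the thread, pass through untouched because they do not decode to a list of content items.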