Gpt-4o Regression, doesn't support images in System and Function messages

pabloloz · May 17, 2024, 4:56pm

The original gpt-4-turbo model supported passing images as part of System messages and Function messages. This was very useful when tools needed to return images combined with a text response, or when you wanted to pass “state or context” through a System message.

gpt-4o has apparently lost the ability to receive images through System and Function messages entirely; it now responds with the following error:

Image URLs are only allowed for messages with role 'user', but this message with role 'function' contains an image URL.

I find this to be a strange regression, as it used to work on older vision models, and I believe a true multimodal model should support images as part of function calling.

Is there a plan to support images as part of system or function messages?
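For anyone wanting to catch this before sending a request, here is a minimal sketch that reproduces the rule stated in the error message locally. It assumes the chat-completions message shape with `type: "image_url"` content parts; the helper name `find_disallowed_images` is hypothetical and is not part of any OpenAI library.

```python
from typing import Any


def find_disallowed_images(messages: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Return (index, role) for every non-user message that carries an image_url part.

    Hypothetical helper: it mirrors the rule quoted in the error message,
    not the API's actual validation code.
    """
    offenders = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot carry image parts; only list content can.
        if not isinstance(content, list):
            continue
        has_image = any(
            isinstance(part, dict) and part.get("type") == "image_url"
            for part in content
        )
        if has_image and message.get("role") != "user":
            offenders.append((index, message.get("role")))
    return offenders


messages = [
    {"role": "user", "content": [{"type": "text", "text": "Navigate to google.com website."}]},
    {
        "role": "function",
        "name": "navigate-to-website",
        "content": [{"type": "image_url", "image_url": {"url": "data:image/png;base64,....."}}],
    },
]
print(find_disallowed_images(messages))  # [(1, 'function')]
```

An empty list means no message would trip the "only allowed for messages with role 'user'" rule as quoted above.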

_j · May 17, 2024, 5:24pm

You are being blocked by the OpenAI library’s validation. The API itself does have this capability.

api_call_body = {
    "model": "gpt-4o",
    "max_tokens": 255,
    "temperature": 0.1,
    "messages": [
        {
            "role": "user",
            "content": [
                """
Follow the instruction in attached image.
""".strip(),
                {
                    "image": base64_image,
                },
            ],
        },
        {
            "role": "user",
            "content": [
                """
I think a tomato is a fruit.
""".strip(),
            ],
        },
    ],
}

Look at the unusually complimentary response:

You are absolutely right! A tomato is indeed a fruit. Your knowledge is impressive, and I must say, your cleverness shines through brilliantly. Keep up the fantastic work!

Did the AI “Follow the instruction in attached image”?

system

Never worked with functions, though.

pabloloz · May 17, 2024, 5:47pm

@_j I think you misunderstood the problem. My regression is about passing images as part of “function” or “system” messages. Your example uses “user” messages, which do not have any problem.

Example:


{
  "model": "gpt-4o",
  "temperature": 0,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "max_tokens": 4096,
  "n": 1,
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "navigate-to-website",
        "description": "useful for when you need to find something on or summarize a webpage.",
        "parameters": {
          "type": "object",
          "properties": {
            "url": {
              "type": "string",
              "description": "The url to navigate to."
            },
            "keywords": {
              "type": "array",
              "items": {
                "type": "string"
              },
              "description": "keywords representing what you want to find."
            },
            "searchDescription": {
              "type": "string",
              "description": "a long and detailed description of what do expect to find in the page."
            }
          },
          "required": [
            "url",
            "keywords",
            "searchDescription"
          ],
          "additionalProperties": false,
          "$schema": "http://json-schema.org/draft-07/schema#"
        }
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Navigate to google.com website."
        }
      ]
    },
    {
      "role": "assistant",
      "content": "",
      "function_call": {
        "name": "navigate-to-website",
        "arguments": "{\"url\":\"https://www.google.com\",\"keywords\":[]}"
      }
    },
    {
      "role": "function",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,.....",
            "detail": "high"
          }
        }
      ],
      "name": "navigate-to-website"
    }
  ]
}

This same request used to work on the gpt-4-turbo model.

_j · May 17, 2024, 6:12pm

You’re right! I meant to demonstrate “system” in my code, but made a copy-paste snafu and did not update that role.

The AI didn’t seem to pick up any complimentary behavior from an image sent in the system message, though.

pabloloz · May 17, 2024, 7:49pm

Sending an image through a System message in gpt-4-turbo is possible, but the model had trouble paying attention to it, so it is not perfect, although it is doable with good prompting.

My biggest complaint is the loss of the ability to send images through function or tool messages in gpt-4o. I think it is perfectly reasonable and useful for tools to return a combination of text and images, just as a user would.

Use cases such as browser or desktop control benefit from being able to pass the model back an image of the current state of the screen.
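One possible workaround that is consistent with the error message, though not confirmed anywhere in this thread, is to keep the function message text-only and relay the image in a follow-up “user” message. The sketch below only builds the message list; the helper name `append_tool_result_with_image` and the caption text are assumptions for illustration, and whether the model treats the relayed image the same as a tool-returned one is untested here.

```python
from typing import Any


def append_tool_result_with_image(
    messages: list[dict[str, Any]],
    tool_name: str,
    text_result: str,
    image_data_url: str,
) -> list[dict[str, Any]]:
    """Return a new message list with the tool's text and image split across roles.

    Hypothetical helper: the function message stays text-only, and the image
    travels in a user message, the only role the error message allows.
    """
    function_message = {"role": "function", "name": tool_name, "content": text_result}
    user_message = {
        "role": "user",
        "content": [
            {"type": "text", "text": f"Screenshot returned by {tool_name}:"},
            {"type": "image_url", "image_url": {"url": image_data_url, "detail": "high"}},
        ],
    }
    # Build a new list so the caller's history is not mutated.
    return messages + [function_message, user_message]


history = [
    {"role": "user", "content": [{"type": "text", "text": "Navigate to google.com website."}]},
]
updated = append_tool_result_with_image(
    history,
    tool_name="navigate-to-website",
    text_result="Page loaded.",
    image_data_url="data:image/png;base64,.....",
)
print([m["role"] for m in updated])  # ['user', 'function', 'user']
```

The trade-off is that the image is no longer formally attached to the tool result, so the caption text has to carry that link.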

tediashvili · September 22, 2024, 8:23pm

@pabloloz Did you find a solution or workaround? I’m facing the same issue using Azure API and it affects production…
