It looks like only user messages are allowed to include images - not tool or system messages.
But actually it would be very helpful to be able to include images in tool outputs or system messages.
For example, imagine a tool that searches the Unsplash API for images. It would be handy if the model could see those images.
And there are many situations where I’d like to insert images into the system message to help the model visualize things.
Are there any plans to allow images in system and tool messages? This would be super helpful and seems straightforward and safe!
Background on TypeScript Types
(Proof that I did my homework.)
If you look at the TypeScript types for system, assistant, and tool messages (ChatCompletionToolMessageParam, ChatCompletionSystemMessageParam, and ChatCompletionAssistantMessageParam), they all appear to define content as a simple string instead of using the ChatCompletionContentPart type.
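To make the contrast concrete, here is a minimal sketch using simplified stand-in types. These are my own local declarations modeled on the description above, not the SDK's actual definitions, and `hasImage` is a hypothetical helper added only for illustration.

```typescript
// Simplified stand-ins (assumptions), NOT the real SDK declarations.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImagePart;

// Only the user message accepts an array of content parts.
type UserMessage = { role: "user"; content: string | ContentPart[] };
// System and tool messages are limited to a plain string here.
type SystemMessage = { role: "system"; content: string };
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };
type Message = UserMessage | SystemMessage | ToolMessage;

// Hypothetical helper: true if the message carries at least one image part.
function hasImage(message: Message): boolean {
  const content = message.content;
  if (typeof content === "string") {
    return false;
  }
  return content.some((part) => part.type === "image_url");
}
```

With types shaped like this, putting an image part into a system or tool message is already a compile-time error, which matches what the API rejects at runtime.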
While some of this could be nothing, coincidental, or just lazy implementation, if I put my tinfoil hat on, it seems like we’re observing a (very concerning) trend here…
It used to be useful to switch an assistant message to the user role and continue with only the edited AI answer as your own input, for quick token savings.
No more.
I need to give my own playground UI from a year ago some practical (rather than exploitative) updates and make it my new screenshot source. It does have a role toggle.
(Noticed that Google AI studio had the same per-message button layout idea as I wrote in May 2023.)
These design decisions are contrived, and make sense only if one of the following is true:
Explicitly disallowing images for assistant and system messages is a deliberate design decision.
OpenAI doesn’t have a single capable software architect.
Communication at OpenAI is so dysfunctional that the UI dev(s) (I wouldn’t be surprised whether it’s a single person or twelve people working on the UI) need to bend over backwards to accommodate these weird API specs.
Now this seems a bit better. If it were draggable it would be sensible. (Have we lost drag-and-drop technology during the pandemic? It used to be everywhere.)
Disallowing images is a deliberate design decision.
It used to be just “images cannot be sent in the first system message”, if I recall correctly, but now you get blocked. Assistants lets assistant messages have images for its own internal purposes. That’s just another case where you don’t get ChatGPT functionality on chat completions: you can only do that replay of an AI seeing something (as if it had produced it) if you want to play in a ChatGPT-like jail.
Facing the same issue. My use case is that I have the LLM call a tool that takes a screenshot, and then I pass the output image back to the LLM. So the role of the message is tool and the content is an image, and I am getting the same error:
Invalid ‘messages[1]’. Image URLs are only allowed for messages with role ‘user’, but this message with role ‘tool’ contains an image URL.
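One possible workaround, sketched below under assumptions, is to keep the tool message as plain text and deliver the image in a follow-up user message, since the error text says images are still accepted there. The types are simplified local stand-ins, and `relayToolImage` is a hypothetical helper, not part of any SDK.

```typescript
// Simplified stand-ins (assumptions), NOT the real SDK declarations.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Hypothetical helper: answers the tool call with text only, then passes the
// image along in a separate user message that is allowed to contain it.
function relayToolImage(
  toolCallId: string,
  imageUrl: string
): [ToolMessage, UserMessage] {
  const toolMessage: ToolMessage = {
    role: "tool",
    tool_call_id: toolCallId,
    content: "Screenshot captured. The image follows in the next user message.",
  };
  const userMessage: UserMessage = {
    role: "user",
    content: [
      { type: "text", text: "Here is the screenshot returned by the tool." },
      { type: "image_url", image_url: { url: imageUrl } },
    ],
  };
  return [toolMessage, userMessage];
}
```

You would append both messages, in this order, after the assistant message that made the tool call. It is clumsy, because the model sees the image as coming from the user rather than from the tool, which is exactly why native support would be better.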
For example, if you get your assistant to create an image and share it with you, then ask the assistant something about the image, this is an instant fail!
I have situations where I’m automating the sharing of picture media via the same local assistant account, which again causes the completion request to fall over! Very silly!!
Please lift this restriction asap, it doesn’t make sense!