Hello everyone,
I’ve been exploring various ways to enhance the functionality of my LLM-based systems, and I’m curious if there are any ways to incorporate images directly into system prompts. If not, why? And will it be possible in the future?
Hello everyone,
I’ve been exploring various ways to enhance the functionality of my LLM-based systems, and I’m curious if there are any ways to incorporate images directly into system prompts. If not, why? And will it be possible in the future?
Hello Yang, you can pass images if the model allows it. Read this part of the documentation:
https://platform.openai.com/docs/guides/vision
Hello! The linked vision guide just given does not provide any indication about which roles support images for vision.
Currently, only gpt-4-turbo-2024-04-09
allows an image to be placed in a “system” role message (and not its substitute, “developer”). It does not perform well, as the “system” role is not the one asking about images. Images there might only help to inform response examples you give, and only if this single AI model also meets your needs in this unique way.
I also would not rely on this being continuously available in the future as it has been purposefully disallowed in all AI models since.
Thanks! Do you mind expanding more on why is it “purposefully disallowed in all AI models since”?
OpenAI went out of their way to explain in the API error message:
Invalid ‘messages[0]’. Image URLs are only allowed for messages with role ‘user’, but this message with role ‘system’ contains an image URL."
There was feedback early on that images in system messages were all but ignored. The vision models were likely not post-trained to answer about images there.