Multiple image analysis using gpt-4o

Hey team,

I am building a mobile application using SwiftUI. I am trying to let the user upload 2 images and send them to GPT at the same time to have them compared with each other… however, I can’t get it to work.
Any advice?
Thanks in advance :smiley:

Welcome @matt.costello1996

Could you share what doesn’t work?

Thanks SPS. So essentially I have been able to send one image to my api and get a returned text analysis of an image. However, I would now like to send two images at the same time, and have them analysed/compared with each other. However I can’t get this to work.

I am currently sending them as base64, however I am getting an error message that says “unexpected response format”. I’ve tried reducing the size of the images etc., but haven’t had any luck.

Thanks in advance :smiley:

Are you using structured outputs or JSON mode?

To elaborate on sps’s question, the error is about the response format specified in your request, not the format of the messages you sent. You might want to double-check your changes between the one-image and two-image versions. Make sure you didn’t make any unrelated changes, for instance, or accidentally add the image to part of the response spec instead of the message parts list.

You can use Swift’s UIImagePickerController or PHPickerViewController to let users upload images. Once you have the images, convert them to base64 strings and send them to GPT as part of a JSON payload via an API. Ensure your GPT endpoint supports handling image comparison logic.
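
For the base64 step, here is a minimal Python sketch of turning raw image bytes into a data URL. The helper names `sniff_mime` and `to_data_url` are made up for illustration and only PNG and JPEG are handled. The point is that the MIME type in the URL should match the actual bytes (a photo from the picker may well be JPEG rather than PNG); the same logic should carry over to Swift.

```python
import base64

# Leading "magic" bytes for the two formats handled in this sketch.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def sniff_mime(data: bytes) -> str:
    """Guess the MIME type from the first bytes (hypothetical helper)."""
    if data.startswith(PNG_MAGIC):
        return "image/png"
    if data.startswith(JPEG_MAGIC):
        return "image/jpeg"
    raise ValueError("unrecognised image format; convert to PNG or JPEG first")


def to_data_url(data: bytes) -> str:
    """Return a data URL whose MIME type matches the image bytes."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{sniff_mime(data)};base64,{encoded}"
```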

Why are the bots responding today?

You must construct the user message properly, simply extending the content list with more image objects:

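# A user message's "content" is now a list of typed parts instead of a string.
# user_text_input, base64_image_file1 and base64_image_file2 are assumed to be
# defined already: the prompt text and the two base64-encoded image strings.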
user = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Produce image comparison analysis: " + user_text_input,
            },
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{base64_image_file1}"},
            },
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{base64_image_file2}"},
            },
            # additional text or image_url blocks
        ],
    }
]
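
To show where that message sits in the whole request, here is a rough sketch of the full JSON body. `build_request_body` is a hypothetical helper, and the data URLs are placeholders standing in for already-encoded images. Note that `response_format`, if you use JSON mode at all, is a top-level key next to `messages`, and the images never go inside it.

```python
import json


def build_request_body(prompt, image_data_urls, json_mode=False):
    """Assemble a Chat Completions request body: one text part plus N images."""
    content = [{"type": "text", "text": prompt}]
    for url in image_data_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})

    body = {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": content}],
    }
    if json_mode:
        # JSON mode is requested at the top level, not inside the message.
        # The prompt itself should also ask for JSON output.
        body["response_format"] = {"type": "json_object"}
    return body


# Placeholder data URLs; real ones carry the base64 image bytes.
body = build_request_body(
    "Compare these two images and reply in JSON.",
    ["data:image/jpeg;base64,AAAA", "data:image/jpeg;base64,BBBB"],
    json_mode=True,
)
payload = json.dumps(body)  # this string is what gets sent as the POST body
```

Whatever the app sends should decode to the same structure, so logging the outgoing JSON from the Swift side and comparing it against this shape is a quick sanity check.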

Hey, thanks for the message. I have been using these two functions to take a photo or upload a photo. I’m just having trouble sending them correctly to my API to have them compared and the analysis sent back (JSON).

Thanks for your reply! I’ve tried changing my logic to this format. I’m now getting an “error parsing response”.

It would be considerably more convenient if you could kindly share the code for the API call where you are sending both images.
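
While you dig that out, one thing worth checking is how the app decodes the reply. Below is a minimal sketch, assuming the raw JSON follows the usual Chat Completions shape (text under `choices[0].message.content`, failures under a top-level `error` object); `extract_text` is a made-up name. If the decoder only expects the success shape, an API error could surface as a generic parsing failure.

```python
import json


def extract_text(raw: str) -> str:
    """Pull the assistant text out of a raw Chat Completions response string."""
    data = json.loads(raw)
    if isinstance(data, dict) and "error" in data:
        # The API rejected the request; surface its message instead of
        # failing later with an opaque parsing error.
        error = data["error"]
        message = error.get("message") if isinstance(error, dict) else error
        raise RuntimeError(f"API error: {message}")
    try:
        return data["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        # Include the start of the payload so the real shape is visible in logs.
        raise ValueError(f"unexpected response shape: {raw[:200]}") from exc
```

Printing the raw response body before decoding it usually shows right away which of the two cases you are in.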

Thanks in advance! :slight_smile: