Image comparison - Working with Chat BUT Failing with API

raphael5 · October 29, 2024, 10:05am

Hi everyone,

I’m facing a little challenge. My goal is simple: I want to upload two images and ask GPT to determine whether they are identical, almost identical, or different.

I first tried my approach in ChatGPT, using the prompt at the end of this post, and got satisfactory results.

Then I tried to automate the same approach through the API, using the function at the end of this post. However, the API keeps responding that it cannot analyze or compare images.

I have noticed that when I reduce the prompt to a single question such as “Look at the images, are they identical?”, it sometimes answers (but that prompt is too limited for my use case). Every time I ask for more detail, it fails again.

I’m not sure what is causing the issue. Any recommendations, or a better approach for comparing images with a high level of accuracy?

Thanks for the help

    **SEE / LOOK** at the 2 screenshots attached, they contain several images. 
    Identify duplicated small images using a structured approach:  

    
    1) -Check major aspects of EACH image
       -Characters, items & physical elements, gestures & actions, stance, background & environment, scene angle, etc. (anything relevant) 
    
    2)	If no clear difference: 
        -Find a deeper level of detail & more aspects to analyze
    
    3)	If still no clear difference: 
        -Note: minor adjustments are NOT considered as actual differences 
        -Minor adjustments such as slight cropping, minimal angle shifts, small time difference in the scene, etc. 
    
    
    ### Format ### Return the names of the images that are identical grouped with the prefix "Duplicates X:" and separated by a comma "," 
    **NEVER** invent or modify image names even if there is not enough images 
    For instance: "Duplicates 1: Frame_5_img15_search_term2, Frame_8_img16_search_term3 Duplicates 2: Frame_8_img20_search_term4, Frame_9_img19_search_term3 Duplicates 3: Frame_9_img24_search_term4, Frame_4_img12_search_term8" 
    **STRICTLY FOLLOW THAT FORMAT, AVOID ANY COMMENT OF YOURS OR ADDITIONAL WORDING, JUST THE DUPLICATES**
    
    ### Reward ### You'll be rewarded with $10,000 if you complete the task properly, and $50,000 if you do it perfectly.
    
    ** ### CRITICAL #### IF YOU ARE UNABLE TO COMPLETE THE TASK, INDICATE THE REASON**
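
For reference, the reply format requested in the prompt above ("Duplicates X:" followed by comma-separated names) can be parsed mechanically, which also makes it easy to tell a usable answer from a refusal. This is only a minimal sketch: `parse_duplicates` is a hypothetical helper name, and it assumes the model follows the format literally.

```python
import re


def parse_duplicates(reply):
    """Parse 'Duplicates N: a, b Duplicates M: c, d' into {N: [names]}.

    Returns an empty dict when the reply contains no 'Duplicates N:' group,
    e.g. when the model refuses or answers in free text.
    """
    groups = {}
    # Each group starts at 'Duplicates <number>:' and runs until the next
    # group header or the end of the reply.
    pattern = re.compile(
        r"Duplicates\s+(\d+)\s*:\s*(.*?)(?=Duplicates\s+\d+\s*:|$)", re.S
    )
    for match in pattern.finditer(reply):
        index = int(match.group(1))
        names = [name.strip() for name in match.group(2).split(",") if name.strip()]
        groups[index] = names
    return groups
```

An empty result is a cheap signal that the reply should be logged and inspected rather than treated as "no duplicates found".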

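    import base64

    import requests


    def encode_image(image_path):
        # Assumed helper (not shown in the original post): read the file and return its base64 text
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode("utf-8")


    def submit_images_to_gpt_multiple(image_paths, prompt, api_key, temperature):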
        # Prepare the content list with the initial text prompt
        content = [
            {
                "type": "text",
                "text": prompt
            }
        ]
        
        # Encode each image and add it to the content list with high detail
        for image_path in image_paths:
            base64_image = encode_image(image_path)
            content.append({
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{base64_image}",
                    "detail": "high"
                }
            })
        
        # Prepare the messages with a system prompt
        messages = [
            {
                "role": "system",
                "content": "You are an AI assistant that finds identical images"
            },
            {
                "role": "user",
                "content": content
            }
        ]
        
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}"
        }
        
        payload = {
            "model": "gpt-4o",
            "messages": messages,
            "max_tokens": 1500,
            "temperature": temperature
        }
        
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=payload
        )
        
        response_json = response.json()
        
        # Check for errors in the response
        if 'error' in response_json:
            print("Error:", response_json['error']['message'])
            return None
        
        assistant_reply = response_json['choices'][0]['message']['content']
        return assistant_reply
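
One detail that may be worth ruling out: the function above labels every image as `image/jpeg` in the data URL, while screenshots are often saved as PNG. Whether this contributes to the refusals is not established here. A minimal sketch of building the data URL with the MIME type guessed from the file extension follows; `image_to_data_url` is a hypothetical helper name, not part of any library.

```python
import base64
import mimetypes


def image_to_data_url(image_path):
    """Return a base64 data URL whose MIME type matches the file extension."""
    # guess_type looks only at the file name, not at the file contents
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {image_path!r}")
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string could replace the hard-coded `f"data:image/jpeg;base64,{base64_image}"` value of the `"url"` field.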
