Object color extraction from images in gpt4o?

patrickmckenzie · October 16, 2024, 6:42pm

Hi everyone,

I have been interested in extracting color data from images using gpt4o-vision. I’m currently sandboxing through the chatgpt mac app.

Unsurprisingly, gpt4o does a great job of describing colors in an image. For example, in the screenshot below, I ask “what is the color of the star in this image” and it correctly and succinctly answers “light purple.”

However, if I slightly modify my question to ask “what is the RGB color value of the star in the image,” GPT pivots to a Python analysis (pasted below), which fails to even capture the star.

I’m curious about what is so different under the hood between what gpt4o is doing in its first (accurate) response and what it’s doing in its second, clunky response, which wraps Python and fails to locate the star. If it uses the first, non-Python approach, does that preclude it from extracting literal values from the image, like RGB color data, and if so, why? Is that approach also precluded from extracting other information, like location (e.g. getting a correct answer to “which pixels contain the star,” so I could pivot to another program to extract the colors of those pixels)?

I don’t have an AI background and just use it as a tool to help with my biology research. Results like this leave me wondering about options for gpt4o’s use in downstream color extraction. Clearly, it can recognize objects in photos and report their categorical colors quite accurately. This is hugely helpful for me! But obviously my next goal is to extract color values from the pixels underlying objects more precisely. This of course is possible without gpt by using point-and-click approaches, but I really value the capacity of gpt to automate the process across really big datasets (removing the manual pointing and clicking part of object detection and data extraction).

Thanks so much!

GPT's Python code (it failed to capture the star and returned 255, 255, 255):

from PIL import Image

# Load the image
image_path = "/mnt/data/star.jpg"
image = Image.open(image_path)

# Convert image to RGB
rgb_image = image.convert("RGB")

# Let's define an area around the star to sample the color
# For simplicity, we'll sample a small region in the center of the star
star_area = (430, 180, 470, 220)  # Assuming coordinates around the star

# Crop the star area
cropped_image = rgb_image.crop(star_area)

# Get the RGB values by averaging the color in this area
pixels = list(cropped_image.getdata())
average_rgb = tuple(sum(x)//len(x) for x in zip(*pixels))

average_rgb
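For contrast, here is a minimal sketch of one way to avoid the hard-coded crop coordinates above. It is not what gpt4o does internally. It assumes the background is roughly uniform and visible in all four corners of the image, treats every pixel that differs from that background by more than a threshold as part of the object, and reports both the mean color and the bounding box of those pixels. The function name `foreground_stats` and the `tolerance` default are hypothetical choices for illustration.

```python
def foreground_stats(pixels, width, height, tolerance=30):
    """Return (mean_rgb, bounding_box) for pixels that differ from the background.

    pixels: flat, row-major list of (r, g, b) tuples,
            e.g. list(rgb_image.getdata()) from PIL.
    Assumption: the four corner pixels show the background color.
    """
    corners = [
        pixels[0],
        pixels[width - 1],
        pixels[(height - 1) * width],
        pixels[height * width - 1],
    ]
    background = tuple(sum(c) // len(c) for c in zip(*corners))

    sums = [0, 0, 0]
    count = 0
    left, top, right, bottom = width, height, -1, -1
    for index, pixel in enumerate(pixels):
        # Skip pixels whose channels are all close to the background.
        if max(abs(p - b) for p, b in zip(pixel, background)) <= tolerance:
            continue
        x, y = index % width, index // width
        left, top = min(left, x), min(top, y)
        right, bottom = max(right, x), max(bottom, y)
        for channel in range(3):
            sums[channel] += pixel[channel]
        count += 1

    if count == 0:
        return None, None  # nothing stood out from the background
    mean_rgb = tuple(s // count for s in sums)
    # Box uses PIL's crop convention: (left, upper, right, lower), end-exclusive.
    return mean_rgb, (left, top, right + 1, bottom + 1)


# Possible usage with the image from the post (requires Pillow):
# from PIL import Image
# rgb_image = Image.open("/mnt/data/star.jpg").convert("RGB")
# mean_rgb, box = foreground_stats(list(rgb_image.getdata()), *rgb_image.size)
```

Because the mask is derived from the pixels themselves, the result no longer depends on guessing where the star is. The threshold would likely need tuning for JPEG artifacts or anti-aliased edges, and this approach will not separate multiple objects or handle a textured background.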
