Hello OpenAI community,
I’m currently working on a project using the GPT-4o-mini API to analyze images. I’ve noticed inconsistencies in the model’s ability to analyze certain images, and I’m hoping to get some clarification and advice from the community.
Context
- Model used: GPT-4o-mini
- Task: Image analysis from URLs
- Method: ChatCompletion API with the image_url parameter
- Detail level: Low (as specified in the API call)
Observed Problem
Some images are analyzed successfully, while others produce a reply claiming the model cannot analyze images at all.
Here is an example: “I can’t view or analyze images directly, but if you provide details or text from the image, I can help you understand or summarize that information.”
What’s intriguing is that:
- The same image that fails on one attempt can be analyzed successfully on another.
- The same code works flawlessly with the larger GPT-4o model, also at low detail, with no analysis issues.
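Since the failures appear intermittent rather than deterministic, one pragmatic mitigation while this is investigated is to detect the refusal phrasing and simply re-issue the request. Below is a minimal sketch; the marker phrases and the call_with_retry helper are my own assumptions (based on the quoted reply above), not an official API feature:

```python
import time

# Phrases observed in the refusal reply quoted above (an assumption:
# the exact wording may vary between responses).
REFUSAL_MARKERS = (
    "can't view or analyze images",
    "cannot view or analyze images",
    "unable to analyze images",
)

def looks_like_refusal(text: str) -> bool:
    """Return True if the reply matches a known 'cannot analyze images' refusal."""
    # Normalize curly apostrophes before matching.
    lowered = text.replace("\u2019", "'").lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def call_with_retry(make_request, max_attempts=3, delay=1.0):
    """Re-issue the request when the model replies with a refusal.

    make_request is any zero-argument callable that performs the API call
    and returns the reply text.
    """
    reply = make_request()
    for _ in range(max_attempts - 1):
        if not looks_like_refusal(reply):
            break
        time.sleep(delay)
        reply = make_request()
    return reply
```

In my testing this kind of retry usually recovers the analysis on the second attempt, but it obviously adds cost and latency, so it is a workaround rather than a fix.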
Example API Call
import openai  # note: this snippet uses the pre-1.0 SDK interface (openai.ChatCompletion)

# image_url holds the publicly accessible URL of the image to analyze
response = openai.ChatCompletion.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What are the info on this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url,
                        "detail": "low"
                    }
                }
            ]
        }
    ],
    max_tokens=500,
)
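For what it’s worth, one variable worth eliminating is the server-side fetch of the URL itself: the image_url field also accepts a base64 data URL, so sending the bytes inline rules out any transient failure to fetch the image on OpenAI’s side. A small sketch (the to_data_url helper is illustrative, not part of the SDK):

```python
import base64

def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL accepted by the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# Usage: read the file locally, then pass the data URL in place of image_url:
#   with open("photo.jpg", "rb") as f:
#       data_url = to_data_url(f.read(), "image/jpeg")
#   ... "image_url": {"url": data_url, "detail": "low"} ...
```

If the failures disappear with inline base64 but persist with URLs, that would point at the fetch step rather than the model itself.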
Questions
- Are there specific limitations for GPT-4o-mini in terms of image size, format, or complexity when using low detail that I should be aware of?
- Are there any best practices for optimizing image analysis with this particular model and detail level?
- Are there any known issues or limitations with GPT-4o-mini regarding image analysis at low detail that could explain this inconsistent behavior?
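While waiting for answers, it may also help to pre-validate each URL against the documented vision-input limits (PNG, JPEG, WEBP, or non-animated GIF, up to roughly 20 MB per image) before calling the API. This is a hedged sketch; the check_image_url helper is my own, and the exact limits should be double-checked against the current docs:

```python
from urllib.request import Request, urlopen

# Documented supported formats and size limit for vision inputs
# (an assumption worth re-verifying against the current documentation).
SUPPORTED_TYPES = {"image/png", "image/jpeg", "image/webp", "image/gif"}
MAX_BYTES = 20 * 1024 * 1024  # ~20 MB

def is_supported_image(content_type: str, size_bytes: int) -> bool:
    """Pure check of a Content-Type header and byte size against the limits."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in SUPPORTED_TYPES and 0 < size_bytes <= MAX_BYTES

def check_image_url(image_url: str) -> bool:
    """Fetch headers only and validate type/size before sending to the API.

    Returns False if the server omits Content-Length, so borderline cases
    may need a full download to verify.
    """
    req = Request(image_url, method="HEAD")
    with urlopen(req, timeout=10) as resp:
        content_type = resp.headers.get("Content-Type", "")
        size = int(resp.headers.get("Content-Length") or 0)
    return is_supported_image(content_type, size)
```

If a URL passes this check yet still fails intermittently, that would suggest the issue is on the model side rather than with the images themselves.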
Any information or advice would be greatly appreciated. If additional details are needed, I’d be happy to provide them.
Thank you in advance for your help!