Lots of instability in GPT-4o multi-modal responses

Not sure what is happening with the GPT-4o models.

Earlier this week, we had too many uptime errors with the Assistants API, and I had to switch over to a self-hosted agent.

Today, we saw far too many “sorry, I can’t help with that” refusals from normal gpt-4o responses. Note that this is all with multi-modal requests. Now we have to switch most of our calls to Gemini models.
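For anyone in the same spot, the switch can be done as a simple fallback router, sketched below (assumptions on my part: the refusal check is a crude string heuristic, and gemini-1.5-pro is just an example model):

```python
import io
import urllib.request

import google.generativeai as genai
from openai import OpenAI
from PIL import Image

openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment
genai.configure(api_key="...")  # your Google AI Studio key

REFUSAL_MARKERS = ("sorry, i can't help", "i can't help with that", "i cannot help")

def ask_with_fallback(prompt: str, image_url: str) -> str:
    """Try gpt-4o first; retry on Gemini when the reply looks like a refusal."""
    resp = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    text = resp.choices[0].message.content or ""
    if not any(marker in text.lower() for marker in REFUSAL_MARKERS):
        return text
    # Crude refusal heuristic tripped: same prompt and image, Gemini instead.
    image = Image.open(io.BytesIO(urllib.request.urlopen(image_url).read()))
    model = genai.GenerativeModel("gemini-1.5-pro")
    return model.generate_content([prompt, image]).text
```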

Though I personally love the ChatGPT app, I notice that I have switched the majority of our services to Gemini because of how unstable the output has become.

If there is anyone from the team who would like to share a bit: what is going on? Is there some developing trend I missed? Or is the team simply not paying attention to the APIs?

Put an image in any message? Well then, you get extra garbage jammed into a system message ahead of whatever text was supposed to align your AI identity to its application domain and purpose.

Try to build any working application against that. You can indeed turn a normally developed application and one of its message tasks into a refusal just by adding a small white image anywhere in the messages; a minimal reproduction is sketched below.
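Here is a minimal sketch of that reproduction (the system prompt and task are placeholders; the point is only that the image flips the behavior):

```python
import base64
import io

from openai import OpenAI
from PIL import Image

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Build a 16x16 all-white PNG and wrap it in a data URL.
buf = io.BytesIO()
Image.new("RGB", (16, 16), "white").save(buf, format="PNG")
white_png = "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # The same system prompt and task behave normally when text-only...
        {"role": "system", "content": "You are a contract-review assistant."},
        {"role": "user", "content": [
            {"type": "text", "text": "Summarize the clauses we discussed."},
            # ...but attaching even a blank image can flip the call to a refusal.
            {"type": "image_url", "image_url": {"url": white_png}},
        ]},
    ],
)
print(resp.choices[0].message.content)
```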

Answer: They are paying attention to what they want, and leaving OpenAI services is an understandable reaction.


[Screenshot omitted: example before]

[Screenshot omitted: example symptom arising from images and prompt injection]

Didn’t HAL 9000 have a meltdown and kill all the astronauts by being told to lie?


Ok, this might explain it. We have been passing screenshots of clients’ presentations during Zoom meetings. Those screenshots may include small regions showing user faces, which we asked the model to ignore in our prompt. But that still completely jeopardizes gpt-4o’s responses.

I noticed one screenshot had only a user profile picture in a corner of the site, which I could hardly notice myself, but GPT still rejected it. GPT models are now mostly useless for our applications.
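If it helps anyone else, one thing worth trying before giving up is to mask that corner client-side instead of asking the model to ignore it. A sketch (the top-right coordinates are just an assumption about where the avatar sits in your layout):

```python
import base64
import io

from PIL import Image, ImageDraw

def mask_corner(path: str, box_px: int = 120) -> str:
    """Black out the top-right corner (where the profile avatar sits)
    and return the screenshot as a data URL ready for the API call."""
    img = Image.open(path).convert("RGB")
    draw = ImageDraw.Draw(img)
    w, _ = img.size
    draw.rectangle([w - box_px, 0, w, box_px], fill="black")
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()
```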