A competitive warning to my friends at OpenAI

We have been running an AI chat service for over a year now. It was originally built on top of OpenAI's Assistants API and then migrated to the Responses API when that became the preferred interface last year. At some point, reliability dropped low enough that we added Google as a fallback generator: whenever OpenAI timed out or failed, we would retry with Google, so our users remained unaware of the problem. (That moved our reliability from 95% to 99.5%.)
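For anyone curious what this kind of silent fallback looks like in practice, here is a minimal sketch. The provider functions are placeholders, not our real integration code, and the timeout value is an assumption; the point is just the ordered try-then-fall-back loop.

```python
from concurrent.futures import ThreadPoolExecutor

# One shared pool so a hung provider call can't block each request's cleanup.
_pool = ThreadPoolExecutor(max_workers=8)

def call_primary(prompt: str) -> str:
    # Placeholder for the primary provider (e.g. a Responses API call).
    return "primary: " + prompt

def call_fallback(prompt: str) -> str:
    # Placeholder for the fallback provider (e.g. a Gemini call).
    return "fallback: " + prompt

def generate(prompt: str, providers=(call_primary, call_fallback),
             timeout_s: float = 10.0) -> str:
    """Try each provider in order; move on when one times out or errors."""
    last_exc = None
    for provider in providers:
        future = _pool.submit(provider, prompt)
        try:
            return future.result(timeout=timeout_s)
        except Exception as exc:  # covers TimeoutError and provider errors
            future.cancel()
            last_exc = exc
    raise RuntimeError("all providers failed") from last_exc
```

Flipping which provider is "primary" is then just a matter of reordering the `providers` tuple, which is essentially what our next release does.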

But OpenAI was clearly the better provider. HOWEVER, in our next release, due next week, we are shifting our primary provider from OpenAI to Google. That means we'll be using OpenAI only when Google times out or fails; essentially, we'll be shifting most of our business to Google. I regret that, because I prefer the OpenAI community and mindset. But, at least in our application, this choice is easy: gemini-3-flash is at least as good as gpt-5-mini in terms of response quality, and Google is more than twice as fast as OpenAI for the types of generation we are doing. Google's prices are slightly higher, but not by enough to offset the speed improvement for us.

This is NOT meant as some put-down of OpenAI. On the contrary, I offer this warning to my OpenAI friends because we hope this situation can be reversed. Being more than 2x slower in a real-time chat application just doesn't cut it. In my opinion, OpenAI's obsession with new features has interfered with your ability to remain the leader in the basics.

I hope your upcoming efforts will allow us to shift back soon.

7 Likes

I’ll make sure this message gets seen by staff!

Thank you for the kind words. We’ll be here, and best of luck with your endeavors!

5 Likes

I have to say I agree with you.

The same model running in the ChatGPT chat interface seems to respond significantly faster than when run through a third-party chat UI using the Responses API. That is, until the conversation gets long and the ChatGPT UI becomes almost unresponsively slow. That dreadful web-application implementation was actually my primary motivation for writing my own UI and going through the API instead. But then I added support for Gemini 3's API too.

Compared with the $1.4 trillion OpenAI is supposedly investing between now and 2030, you would think the cost of hiring a small but talented team to write a decent web app that does not suffer from this slowdown would barely register.

Similarly, we know that matching ChatGPT's response times through the API is definitely possible.