There has never been any secret about GPUs being the current limiting factor; Sam and Greg have tweeted about it several times.
All R&D-centric companies spend the capital they raise for R&D on R&D; this is not an unusual practice.
OpenAI is primarily a research and development company at this stage. That may change as AI usage grows, but for now it explains why they spend so heavily on R&D.
Microsoft is an investor, with a capped return (reportedly 100x). I don’t know what goes on behind closed doors, but both Microsoft and Sam have said that MS does not control the R&D done at OpenAI. They provide large amounts of compute and cash, which are the commodities needed at this stage.
The introduction of the iOS app has not made much of a difference to the user numbers I am seeing in the Discord, and the general questions and answers being shared are typical of an active ChatGPT and API user base. Every new user puts strain on limited resources; that is no secret either.
You are correct that the majority of users’ technical requirements of the model are not as high as those of specific power users, devs, researchers, scientists, tech-based businesses, etc.
Traffic limitations are simply a way to manage resources.
Speed will improve through a combination of additional hardware and model inference tuning.
If you have a usable set of prompts with typical expected replies for testing model releases, it would be great if you could contribute them to OpenAI Evals; that way there will be fewer people unhappy with new updates.
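As a rough illustration of what such a contribution looks like, here is a minimal sketch of building a samples file in the JSONL format used by the openai/evals repository, where each line pairs an "input" chat transcript with an "ideal" expected reply. The prompts, filename, and system message here are all made up for the example; check the evals repo docs for the exact registry and template requirements before submitting.

```python
import json

# Hypothetical prompt/expected-reply pairs in the evals "samples" shape:
# each record has an "input" chat transcript and an "ideal" answer.
samples = [
    {
        "input": [
            {"role": "system", "content": "Answer concisely."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "ideal": "Paris",
    },
    {
        "input": [
            {"role": "system", "content": "Answer concisely."},
            {"role": "user", "content": "What is 7 * 8?"},
        ],
        "ideal": "56",
    },
]

# JSONL: one JSON object per line, which is what the evals loader expects.
with open("my_eval_samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```

A basic exact-match eval over a file like this is then registered with a small YAML entry in the repo, so your test set runs automatically against new model versions.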
I think OpenAI has been the most upfront and honest big company I have ever dealt with.
Azure is a great option if you are moving to production.