I don’t think the currently available GPT-4 models can get much faster, simply because the model is so large. (Edit: they likely will get faster over time, but it’s unlikely OpenAI would offer a slowed-down version and a faster version to higher-paying customers — wait, no dashes — since faster versions would generally be cheaper to run; if they had a faster version, they would probably offer it as soon as they could.) You can try Azure and see whether it performs better, but the difference is probably marginal.
There’s no telling what OpenAI may decide to do: whether they’ll release a gpt-4-turbo, or whether the gpt-3.5 versions will keep catching up with older gpt-4 versions.
In any case, in my opinion the best current suggestion is to limit gpt-4 output length as much as possible (e.g. via max_tokens) and to use gpt-3.5 wherever you can; see the sketch below.
It really depends on your use case and what you need gpt-4 for.
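To make that concrete, here’s a minimal sketch using the OpenAI Python library’s ChatCompletion interface (v0.x). The `needs_gpt4` flag, the 256-token cap, and the system prompt are illustrative placeholders, not a prescribed setup; tune them for your own use case:

```python
import openai

openai.api_key = "sk-..."  # your API key

def ask(prompt: str, needs_gpt4: bool = False) -> str:
    """Route a request to gpt-3.5-turbo unless it really needs gpt-4,
    and cap the output length to keep latency down."""
    model = "gpt-4" if needs_gpt4 else "gpt-3.5-turbo"
    response = openai.ChatCompletion.create(
        model=model,
        messages=[
            # Asking for brevity in the system prompt also helps keep
            # completions (and therefore response times) short.
            {"role": "system", "content": "Answer as concisely as possible."},
            {"role": "user", "content": prompt},
        ],
        max_tokens=256,  # hard cap on output tokens; fewer tokens = faster
    )
    return response["choices"][0]["message"]["content"]

# Simple tasks usually don't need gpt-4:
print(ask("Summarize in one sentence: latency grows with output length."))
```

Since generation time scales roughly with the number of output tokens, capping `max_tokens` and prompting for concise answers are the most direct levers you have on response time.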
ChatGPT Enterprise says it can be up to twice as fast. That is not the API, though.
I’d encourage you to read the February announcement of ChatGPT Plus and see whether a single promise there about increased speed turned out to be true. Expect the same with the vaporware that is ChatGPT Enterprise, where you don’t get a call back unless you tick the 10,000+ seats box and your name is Boeing.