GPT-4-Turbo and GPT-4-O benchmarks released! They do well compared to the marketplace

duncan.haywood · May 13, 2024, 7:12pm

Hi all,
I’m happy to say that the benchmarks on the gpt-4-turbo and gpt-4-o models were finally released by OpenAI and they both do pretty well.

openai/simple-evals (github.com)

Additionally, we have results on the LMSYS leaderboard now for subjective preferences:

https://leaderboard.lmsys.org/

whyjpwhybot · May 13, 2024, 8:24pm

Will it have a free version of gpt-4-o? Or is it only for people who pay?

NotFenixio · May 13, 2024, 8:33pm

If you’re going to use it through the API, of course it’ll be paid, like all products.

But, according to the announcement, we should get the free version in ChatGPT (new domain, yay!) in the next few days.

btrower · May 13, 2024, 9:08pm

The first question I posed to ChatGPT-4o was answered quite quickly. Thereafter, though, it got slower and slower until now I have to go away and do something else while I am waiting. This whole answer was typed in while waiting and I’m still waiting.
Update: Waited for a long time. Still waiting…
Stopped and started it again. System is not very responsive.

duncan.haywood · May 16, 2024, 9:06pm

It’s possible that they are experiencing load balancing issues while they roll out the deployments of this new model as traffic increases.

duncan.haywood · May 16, 2024, 9:07pm

It’ll be free with limited usage in ChatGPT, but in the API it’s paid, though it’s cheaper than the last model.

btrower · May 17, 2024, 4:18pm

Well, thanks for the ‘heads up’. I get that things don’t always go smoothly. I am back here again while I am waiting on ChatGPT-4o to complete. It’s been a while and it is stuck with a symbol that seems to indicate it believes it is continuing to work, but the output remains stuck where it was about ten minutes ago. I am on the paid plan, though I would think that shouldn’t make a difference as to fundamental ‘workingness’. If it’s a load issue, then the system should indicate that there will be a delay. Also, since it seems to have no hope of coming back, I am left with stopping it and trying again. From my experience thus far, I have to start a new session.
Given that there are issues here, someone should tweak the interface so that it gives some ongoing indication of its status as it works so at least if it’s frozen it is clear it is no longer updating and that it should be stopped and tried again. Just checked and it is frozen, so trying again. Very slowly chugged away and then stopped again. Of the AIs that I have been using in recent months (Poe, Gemini, Claude, GPT-4, Groq etc.) this is the slowest thus far. There must be thousands of programmers like myself using this system. Perhaps you should reach out and ask for assistance?

btrower · May 17, 2024, 4:52pm

Update: I was unable to get ChatGPT-4o to complete the thing I was doing in Chrome. When I looked at the developer tools there were a variety of errors, most seemingly due to issues with cross-site scripting. Rather than troubleshoot that, I opened it in Firefox and got it working again, albeit at a glacial pace (it was still working while I was writing this update).
There were a variety of errors; the most worrisome performance-wise were attempts to read files from other sites for things like fonts, which should not be needed at all. Communication is one of the very slowest and flakiest things in computing and should be avoided whenever possible. One of the main reasons that Groq is so fast is that they designed out most of the communication precisely to increase performance.
Anyway, all’s well that ends well. I finally got the one thing to finish.
