Worse results when using GPT-4o as an evaluator

It would be nice to see how much longer gpt-4 took compared to gpt-4o. If I recall correctly, gpt-4o is a quantization of gpt-4-turbo, but I could be wrong about this!

I’ve posted a comment on a topic comparing both models. The chart I provided was published by OpenAI on GitHub; if I recall correctly, it was in the simple-evals repository.