List of fresh GPT-4o benchmarks, please add

The dangers of using the LMSYS leaderboard: a huge drop in the 4o score from OpenAI’s rather excitable initial tweet (x.com)
