GPT-4-Turbo models perform better than the older GPT-4 models in the LMSys benchmark

I use them both. Have to. 90% Opus, 10% GPT.

I explained with a CSS example in another comment here: How to deal with "lazy" GPT 4 - #134 by Diet

And I’ll say that’s still true. But for another example, I attempted to have Opus write some Django backend/database stuff for basic Google SSO, and if I had used its code, it would have borked my entire backend. I switched to GPT-4 and it got it right on the second go.

At first I thought it was that GPT-4 was more “logical” but I don’t think that’s the right way of putting it. It’s odd. Maybe just a difference in training data. I still use Opus for most things.