One thing I like about SWE-bench is that it's a pretty good way to compare Claude versus GPT4.
This echoes my experience as well. I don't know why people are trash-talking GPT4 in this forum.