Chat GPT4 1106 vs ChatGPT 4: Impressive drop in quality

alexandrecougnenc · November 7, 2023, 3:14pm

Hello,
I’ve been doing a lot of testing since yesterday and so far the new version ChatGPT 4 0611 is really not as good in terms of response quality as Chat GPT4.

Any ideas for improving the quality of responses? Any particular prompts or instructions?

Personally, I’m still looking.

tangmingxyz · November 7, 2023, 3:32pm

There is no ChartGPT 4… If you are using API, you can still use GPT-4 API, it’s still available.

aletheus · November 7, 2023, 3:38pm

I asked the model with web browsing capabilities, and it has affirmed there is delay and confusion about version deployment. For example I’m still seeing the drop down menu in my end user chat GPT chat window. I can access all of the features in the assistant build dashboard, but anytime I attempt to upload a file for inspection with my new assistant, it stalls out. I haven’t even gotten an error log yet. However the model is responsive through normal conversation.

alexandrecougnenc · November 7, 2023, 9:00pm

i don’t speak of ChartGpt ^^ and yes i know for the API but it’s a shame… They launch a version that is “2x better” but it doesn’t

pedromatoso · November 7, 2023, 9:11pm

Got any examples where GPT-4 Turbo doesn’t answer as well as GPT-4?

anjula · November 8, 2023, 6:11am

I also had a try yesterday. Using the GPT4 Turbo model, some of the responses I receive to my inquiries are inaccurate. However, the answer is right when I try the same question again.
When I use gpt4, my daily token cost is less than it is on other days.
How can I make it more accurate the first time? What actions can I take to deal with it?

EduGPT · November 8, 2023, 2:22pm

Yes, quality is much lower. I mean is 3x cheaper. Feels like a little better 3.5.

j.deluca · November 8, 2023, 3:05pm

My whisper post-processing task with GPT-4 just doesn’t work anymore. Answers absolutely unexpected

trenton.dambrowitz · November 8, 2023, 3:18pm

Odd, I used the GPT4-Turbo API for my whisper meeting minutes and it did a fantastic job.

The transcription came out to around 9k tokens, I was struggling to get old GPT4 to do it earlier on Monday

o.auth · November 8, 2023, 9:49pm

Very interesting. I very often observe that a specific request receives a rather general answer which doesn’t seem completely inadequate, but it does not contain the exact information asked for. Then I ask the same question again and I usually get the answer. Interstingly the “geralized” parts appear very quickly while more “personalized” parts take much longer. Potentially performance optimization, but this would definately be on the expense of respond quality.
I also find it suspicious that it is called “Turbo” - “Turbo” having been the “budget-version” so far.
What do you experience with GPT-4 Turbo?

happslabs · November 15, 2023, 11:10pm

Impressive delays, degraded speed and quality of response felt here. ChatGPT launched from 0 - 100 and now down to 15 in my view. Would be good to know if this is a temporary issue or just the new normal.

trenton.dambrowitz · November 16, 2023, 8:12am

Likely a new normal for the current models, if I was to speculate I’d say OpenAI has been restricting and tuning their models in an attempt to stay out of the negative press and relieve some of the social (and soon to be regulatory) pressure.

Hopefully we’ll get a new Bing Chat moment when the next big model launches and it’ll be up for anything

Foxalabs · November 16, 2023, 8:25am

In every test I have performed and with 4 project implementations and a bunch of customer installs the turbo model has either equalled or significantly improved upon GPT-4’s last outing. (Current server overload issues notwithstanding)

If anyone has a prompt that worked on GPT-4-8K or 32K that no longer works with GPT-4-Turbo I’d be happy to spend time with them to help them resolve their issues.

N2U · November 16, 2023, 2:43pm

I have prompt that used to work but it’s rather long, so I’ve compressed it a bit:

Write an openapi schema a according to the API specifications delimited '#'

###
<copy-paste from home assistant doc's>
###

Here’s the documentation in question:

Usually I would get something that resembles the correct openAPI schema, but now I get instructions on how to write it myself

trenton.dambrowitz · November 16, 2023, 2:46pm

My favourite type of code, a to-do list!

N2U · November 16, 2023, 3:19pm

I mean, they’re good instructions, but the idea here is to bamboozle GPT into working for me, not the other way around

That said though, I did try again just now with the pure markdown of the documentation from this page:

https://raw.githubusercontent.com/home-assistant/developers.home-assistant/master/docs/api/rest.md

And that seems to have helped a lot, so input quality definitely has something to do with it

Update: Yeah, turns out this just needs a more affirmative system prompt and removing the description on how to make the authentication request because GPT cannot write this without my auth token.

zarko.rashev · November 16, 2023, 9:46pm

You can only imagine what they did to the quality in order to make it 3x cheaper.

Bennet · December 4, 2023, 10:59am

I also feel like gpt-4-turbo was sparsified so much, it is now a better 3.5 but not a whole different level anymore.

It broke many of my prompts and use-cases, by being unpredictable and much less adherent to details in the prompt. It is much less “deep” and often simply feels as dumb as 3.5 because it also does not see nuances etc. - it will only do obvious stuff correctly and reliably.

Much less magic Hope there will be a more expensive model option in the future that is as good as old gpt-4 but also up to date.

alexandrecougnenc · December 4, 2023, 12:58pm

Yes exactly the same feelings.
I have made so many try with GPT 4 Turbo, but he is so bad.
I come back to the expensive GPT 4…

tom_t · December 5, 2023, 11:51am

It might be like with washing machines - now selecting 60 degress doesn’t mean it will be 60 - due to energy saving they are trying to achieve the same effect as with 60… What if gpt-4-turbo is just gpt-3.5 with additional prompt optimization? For ChatGPT they already almost for sure do it.

Topic		Replies	Views
GPT-4 128K only has 4096 completion tokens API gpt-4	9	27345	February 27, 2024
GPT 4 Turbo is limited to 4K? API gpt-4	16	14203	April 9, 2024
Why is gpt-3.5-turbo-1106 max_tokens limited to 4096? API	3	14124	January 11, 2024
Is it me or GPT4 consistently doesn't finish and cuts the answers? API	18	6705	April 11, 2024
ChatGPT-4 Limits? Are they the same as for ChatGPT-3.5? API	12	8736	December 12, 2023

Chat GPT4 1106 vs ChatGPT 4: Impressive drop in quality

Related topics