Hello,
I’ve been doing a lot of testing since yesterday, and so far the new GPT-4 0611 version is really not as good as GPT-4 in terms of response quality.
Any ideas for improving the quality of responses? Any particular prompts or instructions?
I asked the model with web browsing capabilities, and it affirmed that there is delay and confusion about version deployment. For example, I’m still seeing the drop-down menu in my end-user ChatGPT chat window. I can access all of the features in the assistant build dashboard, but any time I attempt to upload a file for inspection with my new assistant, it stalls out. I haven’t even gotten an error log yet. The model is responsive through normal conversation, however.
I also gave it a try yesterday. Using the GPT-4 Turbo model, some of the responses I receive to my inquiries are inaccurate. However, the answer is right when I ask the same question again.
When I use GPT-4, my daily token cost is lower than it is on other days.
How can I make it more accurate the first time? What actions can I take to deal with this?
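One thing that has helped me get more consistent first-pass answers is pinning the sampling parameters down instead of relying on the defaults. Here is a minimal sketch using the v1 Python openai client (the model ID, system prompt, and question are placeholders): temperature=0 removes most of the sampling randomness, and the beta seed parameter, together with checking system_fingerprint, should make reruns more reproducible, best effort.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # GPT-4 Turbo preview; swap in your model
    messages=[
        {"role": "system", "content": "Answer precisely and give the exact figure asked for."},
        {"role": "user", "content": "What is the default timeout of service X?"},  # placeholder question
    ],
    temperature=0,  # near-greedy decoding: far less run-to-run variation
    seed=42,        # beta feature: best-effort deterministic sampling
)

print(response.choices[0].message.content)
# If system_fingerprint changes between runs, the backend changed and
# results may differ even with the same seed.
print(response.system_fingerprint)
```

This won’t fix genuinely wrong knowledge, but in my experience it removes a lot of the “right answer only on the second try” lottery that pure sampling variance causes.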
Very interesting. I very often observe that a specific request receives a rather general answer, which doesn’t seem completely inadequate, but it does not contain the exact information asked for. Then I ask the same question again and I usually get the answer. Interestingly, the “generalized” parts appear very quickly, while the more “personalized” parts take much longer. Potentially a performance optimization, but this would definitely come at the expense of response quality.
I also find it suspicious that it is called “Turbo”, since “Turbo” has so far been the budget version.
What is your experience with GPT-4 Turbo?
Noticeable delays and degraded speed and response quality here. ChatGPT went from 0 to 100 at launch and is now down to 15, in my view. It would be good to know whether this is a temporary issue or just the new normal.
Likely a new normal for the current models. If I were to speculate, I’d say OpenAI has been restricting and tuning their models in an attempt to stay out of the negative press and relieve some of the social (and soon-to-be regulatory) pressure.
Hopefully we’ll get a new Bing Chat moment when the next big model launches and it’ll be up for anything.
In every test I have performed, and across 4 project implementations and a bunch of customer installs, the Turbo model has either equalled or significantly improved upon GPT-4’s last outing (current server overload issues notwithstanding).
If anyone has a prompt that worked on GPT-4-8K or 32K that no longer works with GPT-4-Turbo, I’d be happy to spend some time with them to help resolve the issue.
And that seems to have helped a lot, so input quality definitely has something to do with it.
Update: Yeah, it turns out this just needed a more affirmative system prompt, plus removing the description of how to make the authentication request, because GPT cannot write that without my auth token.
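For anyone hitting the same wall, here is a rough sketch of what I mean by “more affirmative”, using the v1 Python client. The endpoint, prompt wording, and placeholder token are hypothetical, not my actual project; the point is to assert what the model should do and hand it a placeholder instead of describing an authentication flow it cannot complete.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical "affirmative" system prompt: direct imperatives, no room
# for the model to refuse because it lacks real credentials.
system_prompt = (
    "You write HTTP request examples. Always produce the full request. "
    "Never refuse or apologize for missing credentials: wherever a real "
    "token would go, write the literal placeholder {AUTH_TOKEN}."
)

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": system_prompt},
        # Placeholder task; in my real prompt I removed the auth how-to entirely.
        {"role": "user", "content": "Write the curl call for POST /v2/orders."},
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```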
I also feel like gpt-4-turbo was sparsified so much that it is now just a better 3.5, not a whole different level anymore.
It broke many of my prompts and use cases by being unpredictable and much less adherent to details in the prompt. It is much less “deep” and often simply feels as dumb as 3.5 because it also misses nuances; it will only do the obvious stuff correctly and reliably.
Much less magic. I hope there will be a more expensive model option in the future that is as good as the old gpt-4 but also up to date.
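The one workaround that has partially restored adherence for me (a sketch under my own assumptions, not a guaranteed fix) is to pull every detail the model keeps dropping out of the prose and restate it as short, numbered, non-negotiable rules in the system message:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical example: requirements gpt-4-turbo kept skipping when they
# were buried in a paragraph, restated as explicit numbered rules.
system_prompt = """You are a report summarizer. Follow EVERY rule below.
1. Output exactly three bullet points, no more, no fewer.
2. Each bullet must be under 20 words.
3. Quote at least one figure from the source text verbatim.
4. If a rule cannot be satisfied, name the rule and stop."""

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Summarize: Q3 revenue rose 12% to $4.1M ..."},  # placeholder input
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```

It doesn’t bring back the old depth, but numbered hard rules get followed far more reliably than the same constraints written as prose.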
It might be like with washing machines: selecting 60 degrees no longer means it will actually be 60, because, to save energy, they try to achieve the same effect as at 60… What if gpt-4-turbo is just gpt-3.5 with additional prompt optimization? For ChatGPT they almost surely do this already.