GPT-4-Turbo: A Step Back in Logic and Consistency?

I replaced the GPT-4 model in my law reasoning/analysis system with a preview version of GPT-4-Turbo (“GPT-4-1106-preview”). After basic initial tests, it seems that while it is faster and much cheaper, it is unfortunately significantly degraded in quality compared to GPT-4.

The main issues are:

Impaired logical reasoning:
A significant portion of my work involves analyzing legal acts, and GPT-4-Turbo's performance on these tasks is noticeably inferior to GPT-4's.

Increased variability in responses (less deterministic):
The responses fluctuate drastically given the same prompt and parameters, especially when the LLM is instructed to perform scoring (e.g., how relevant a specific article is).
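For what it's worth, the 1106 preview also introduced a `seed` parameter (and a `system_fingerprint` response field) intended to make sampling more reproducible alongside `temperature=0` — determinism is "best effort," not guaranteed. A minimal sketch of how I'd structure the scoring request, assuming the official `openai` Python client; the prompt wording and helper name are illustrative:

```python
# Sketch: request settings aimed at reproducible relevance-scoring runs.
# `seed` and `system_fingerprint` shipped with gpt-4-1106-preview;
# identical seed + prompt + params should usually yield identical output.

def build_scoring_request(article_text: str, question: str) -> dict:
    """Assemble chat-completion kwargs for a relevance-scoring call."""
    return {
        "model": "gpt-4-1106-preview",
        "temperature": 0,   # minimize sampling variance
        "seed": 42,         # best-effort reproducibility
        "messages": [
            {"role": "system",
             "content": ("Score the relevance of the article to the question "
                         "on a 1-10 scale. Reply with the number only.")},
            {"role": "user",
             "content": f"Question: {question}\n\nArticle: {article_text}"},
        ],
    }

# Usage (requires an API key; not run here):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(**build_scoring_request(act, q))
# print(resp.choices[0].message.content, resp.system_fingerprint)
```

If the `system_fingerprint` changes between calls, the backend itself changed, so differing outputs are expected even with a fixed seed.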

I hope these issues are resolved with the release of the stable version promised in “a few weeks.” If not, then GPT-4-Turbo seems more akin to a “GPT-3.8-Turbo,” which could be useful in some cases, but not for the work that I am doing.


I have the opposite problem with normal GPT-4: all of a sudden, responses are completely deterministic.


New model releases should consistently include benchmarks for a clear comparison of the changes between them. It seems to me that OpenAI only releases benchmark data when a new model outperforms its predecessor. There's a tendency to withhold such information when introducing models with fewer capabilities. During yesterday's presentation, Sam claimed that GPT-4 Turbo is the most advanced version yet, but I'm skeptical without the hard data to back it up.

It’s become apparent to many that GPT-4’s performance has declined since its release in March, and OpenAI remains silent about it.

It looks like cutting costs has become the main focus, which is quite disappointing.


I am seeing it repeatedly make the same mistake when generating SQL, despite few-shot examples and explicit instructions about how to avoid it.
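For context, this is the kind of few-shot setup I mean, sketched in the standard chat-messages format — the tables, queries, and the "mistake" being corrected are illustrative stand-ins, not my real schema:

```python
# Sketch: few-shot chat messages demonstrating the desired SQL pattern
# (here: quoting mixed-case identifiers) before asking the real question.

def build_sql_messages(question: str) -> list[dict]:
    """Few-shot messages: show the correct pattern twice, then ask."""
    system = ("You are a SQL assistant. Always quote mixed-case column "
              "names with double quotes; never use backticks.")
    shots = [
        ("Total orders per customer",
         'SELECT "CustomerId", COUNT(*) FROM orders GROUP BY "CustomerId";'),
        ("Latest signup date",
         'SELECT MAX("SignupDate") FROM users;'),
    ]
    messages = [{"role": "system", "content": system}]
    for q, sql in shots:
        messages.append({"role": "user", "content": q})
        messages.append({"role": "assistant", "content": sql})
    messages.append({"role": "user", "content": question})
    return messages
```

Even with this structure, the preview model keeps reverting to the pattern the examples were written to rule out.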
