ChatGPT turbo API not response with right answer as default ChatGPT and ChatGPT-4

cghui.sunshine · March 19, 2023, 3:48pm

For this question:
早晨7:30,全国马拉松锦标赛(无锡站) 2023COLMO无锡马拉松暨布达佩斯世锦赛及杭州亚运会马拉松选拔赛·大运河马拉松系列赛 (无锡站)鸣枪起跑。上面这条新闻里,是一天三场马拉松吗?

Default ChatGPT and ChatGPT-4 gave the right answer, but Legacy ChatGPT and ChatGPT turbo API gave the wrong answer.

I have a question: It seems that ChatGPT turbo API is the same as Legacy ChatGPT, not the Default ChatGPT

AgusPG · March 19, 2023, 5:51pm

I don’t know what “Default ChatGPT”, “ChatGPT-4”, “Legacy ChatGPT” and “ChatGPT turbo” are.

If you’re referring to requests submitted via API, can you please refer to the actual name of the models in the model parameter so we can help you understand the differences?

cghui.sunshine · March 19, 2023, 6:07pm

“Default ChatGPT”, “ChatGPT-4”, “Legacy ChatGPT” are there ChatGPT version for Plus user. And the ChatGPT API model I take are gpt-3.5-turbo and gpt-3.5-turbo-0301. The API and Legacy ChatGPT have similar wrong answer.

AgusPG · March 19, 2023, 6:37pm

Thanks for the clarification. It makes sense that gpt-4 model is better on average, as it’s the new one. Regarding the others: they are probably similar versions of the same base model, but you might still find differences.

Especially because gpt-3.5-turbo in the API depends on a system message that might not be the one being used in ChatGPT interface with “Default” model.

Generally, I discourage experimenting via ChatGPT interface in terms of developing and retrieving reproducible results. We have the Playground and the API for this purpose.

cghui.sunshine · March 19, 2023, 6:42pm

The most weird one is that the Legacy and Default ChatGPT does not agree with each other in this specific question, It should be similar, but It give totally contrary answer. I thought it be a bug, not just because of model difference

AgusPG · March 19, 2023, 9:54pm

I see your point. However, these models have not been trained to be factual, but to produce sequences of tokens that sound reliable, given the existing context.

Slight model/parameters differences can result in huge output differences. The accuracy of the generated response is just not a priority for these models (even though this policy can be corrected after pre-training via RLHF and other techniques).

Topic		Replies	Views
Different output generated for same prompt in chat mode and API mode using gpt-3.5-turbo API gpt-35-turbo , chatgpt , api	16	10254	December 18, 2023
Chat completions API can't answer its model correctly API	4	1185	December 17, 2023
The inconsistent responses between api and the website version API gpt-4 , chatgpt , api	11	2554	December 14, 2023
Why is there a difference in ChatGPT web version vs gpt 3.5 api model (gpt-3.5-turbo /text-davinci-003) API	6	7116	January 21, 2024
Why is GPT 4's response and performance on playground is so different from when using chatgpt 4 API gpt-4	12	8357	December 16, 2023

ChatGPT turbo API not response with right answer as default ChatGPT and ChatGPT-4

Related topics