ChatGPT turbo API does not respond with the right answer, unlike Default ChatGPT and ChatGPT-4

For this question:
(Translated from Chinese:) "At 7:30 in the morning, the National Marathon Championship (Wuxi leg), the 2023 COLMO Wuxi Marathon cum Budapest World Championships and Hangzhou Asian Games marathon trials · Grand Canal Marathon Series (Wuxi leg) started with the firing of the gun." In the news item above, is that three marathons in one day?

Default ChatGPT and ChatGPT-4 gave the right answer, but Legacy ChatGPT and the ChatGPT turbo API gave the wrong answer.

My question: it seems that the ChatGPT turbo API behaves the same as Legacy ChatGPT, not Default ChatGPT.

I don’t know what “Default ChatGPT”, “ChatGPT-4”, “Legacy ChatGPT” and “ChatGPT turbo” are.

If you’re referring to requests submitted via the API, can you please refer to the actual names of the models in the model parameter, so we can help you understand the differences?

“Default ChatGPT”, “ChatGPT-4”, and “Legacy ChatGPT” are the ChatGPT versions available to Plus users. The ChatGPT API models I used are gpt-3.5-turbo and gpt-3.5-turbo-0301. The API and Legacy ChatGPT give similarly wrong answers.

Thanks for the clarification. It makes sense that the gpt-4 model is better on average, as it’s the newer one. Regarding the others: they are probably similar versions of the same base model, but you might still find differences.

Especially because gpt-3.5-turbo in the API depends on a system message, which might not be the same one used by the ChatGPT interface with the “Default” model.
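To make the difference concrete, here is a minimal sketch of what a gpt-3.5-turbo chat request looks like, with the system message supplied explicitly. The system message and helper name here are illustrative assumptions; the actual system message ChatGPT uses internally is not public.

```python
import json

def build_chat_request(question: str,
                       system_message: str = "You are a helpful assistant.",
                       model: str = "gpt-3.5-turbo",
                       temperature: float = 0.0) -> dict:
    """Build the JSON payload for a /v1/chat/completions request.

    The system message is the first entry in `messages`; changing it can
    change the model's answers, which is one reason API results may differ
    from the ChatGPT interface.
    """
    return {
        "model": model,
        "temperature": temperature,  # 0 makes runs as reproducible as possible
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": question},
        ],
    }

payload = build_chat_request("Is that three marathons in one day?")
print(json.dumps(payload, indent=2))
```

When comparing models, keeping the system message and temperature fixed isolates the effect of the model itself.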

Generally, I discourage experimenting via the ChatGPT interface when you are developing and need reproducible results. We have the Playground and the API for that purpose.

The weirdest part is that Legacy and Default ChatGPT do not agree with each other on this specific question. They should be similar, but they give totally contrary answers. I thought it might be a bug, not just a model difference.

I see your point. However, these models have not been trained to be factual, but to produce sequences of tokens that sound plausible given the existing context.

Slight model/parameter differences can result in huge output differences. The accuracy of the generated response is just not a priority for these models (even though this can be corrected after pre-training via RLHF and other techniques).
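As a toy illustration of that sensitivity (not the actual model internals), here is how sampling temperature reshapes a next-token distribution. The logit values are made up; the point is that a small change in one parameter noticeably shifts the probabilities, so two near-identical models or settings can diverge into contrary answers.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to token probabilities at a given sampling temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for two candidate next tokens, e.g. "yes" vs. "no".
logits = [2.0, 1.5]

for t in (0.2, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {probs}")
```

At low temperature the slightly-preferred token dominates; at temperature 1.0 both tokens keep substantial probability, so repeated runs can flip between them.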
