O1 API vs ChatGPT.com: Why is the quality so different? Seeking insights!

Hello,

I’m seeing significant quality differences for the same prompt when it’s run via chatgpt.com (with o1) versus through the API (also with o1).

The prompts are identical, and on the API side I allocated 30,000 tokens for reasoning (`max_completion_tokens`). The result obtained via chatgpt.com is far superior.
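
For reference, here is roughly what my API call looks like (a minimal sketch using the official `openai` Python SDK; the prompt placeholder stands in for my real prompt):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1",
    messages=[
        # the o1 beta is restrictive about system/developer roles,
        # so the whole analysis prompt goes into a single user turn
        {"role": "user", "content": "<my ~17,000-token sectoral-analysis prompt>"},
    ],
    # shared budget for the hidden reasoning tokens plus the visible answer
    max_completion_tokens=30_000,
)

print(response.choices[0].message.content)
```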

On the API side, I’m getting results that are more or less similar to those I used to get with GPT-4-turbo.

I’m conducting a sectoral analysis based on local data (about 17,000 tokens for the initial prompt).

Do you have any suggestions or insights? Do you know if chatgpt.com performs any pre-processing that differs from a direct API call, or if other parameters could be influencing the results?

Thanks for your help!

Beyond the quality, the length of the completion provided via the API puzzles me: it’s barely a third of what I get from chatgpt.com.

Knowing that the o1 beta doesn’t accept any sampling parameters (temperature, frequency penalty, etc.), I don’t understand where the difference is coming from or how I can get closer to the completions chatgpt.com produces.
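
One hypothesis I’m checking (my assumption, based on the usage breakdown the API returns for reasoning models): the hidden reasoning tokens are counted inside `max_completion_tokens`, so a large share of the budget may be spent before any visible text is produced. Reusing the `response` object from the snippet in my first post:

```python
# Sketch: split the completion budget into hidden reasoning tokens
# and visible answer tokens, using the usage object returned above.
usage = response.usage
reasoning = usage.completion_tokens_details.reasoning_tokens

print("total completion tokens:", usage.completion_tokens)
print("hidden reasoning tokens:", reasoning)
print("visible answer tokens:  ", usage.completion_tokens - reasoning)
```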

Am I the only one here?

I’ve seen that some people had this issue with DALL-E, and a `revised_prompt` field helped identify the differences, but no equivalent exists in the chat completions API :frowning:

I have the same issue. I allocated 32,000 tokens to the o1-preview API, but the quality is far below what I get from the o1-preview I have access to as a ChatGPT Plus user.

The answers are much shorter and don’t go into as much depth.

Hello @Naemy, do you also see the same issue when comparing other models via the API vs. a ChatGPT subscription?

I have the impression this behaviour is the same regardless of the model.

I will run further tests today and post the results here; I’m planning something along the lines of the sketch below. I also still need to test the Playground vs. the API vs. the ChatGPT subscription…
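
A sketch of what I intend to run (the model list and the 32,000-token budget mirror what was discussed above; the prompt placeholder is hypothetical):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = "<the exact prompt I paste into chatgpt.com>"  # placeholder

# Compare answer length and token usage across the o1 beta models
for model in ["o1-preview", "o1-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        max_completion_tokens=32_000,
    )
    answer = response.choices[0].message.content
    print(f"{model}: {len(answer)} chars, "
          f"{response.usage.completion_tokens} completion tokens")
```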

Does anybody have any advice on how to align the quality of the results?

Have a great day,

Pierre-Jean

Hello,

I haven’t tested other models extensively, as I mainly got API access for o1-preview, but I had the impression that the o1-mini answers were also shorter.

I use it mainly for programming, and the o1-preview API’s performance is underwhelming.
I’ve tested several platforms, but none of them performs the way o1-preview does in OpenAI’s ChatGPT, even with 32,000 max tokens.

Let us know how the tests go, I’m interested.
Have a great day too.