So apparently OpenAI is testing the integration of o1-capabilities into GPT-4o. Many will know that, from time to time, GPT-4o provides two answers, one of which the user can then choose as the preferred one. This serves to improve answers in the long run. Interestingly, today I encountered a first: in my GPT-4o chat, one of the answer options was generated using the o1-capabilities.
Welcome to the forum!
Thanks for sharing.
Is “o1-capabilities” something that you saw on the screen, or was it something you conjectured? I ask because o1-<xyz> is a naming scheme, and that can easily confuse many if you conjectured it.
You could say it is something I conjectured, based on observations for which no other explanation would make sense. Let me elaborate:
1.) The answer generation took unusually long, which I initially blamed on a possible connection issue.
2.) However, when I looked at the generated answers, one of them had a header that said “Thought for 19 seconds”. I expanded the header and saw the thought process it used, much like o1 does. I had never encountered this before, and I use GPT-4o extensively.
I did add a screenshot of it in my original post.
I would love to share the link to the chat; unfortunately, I confronted it with the fact that it used o1 reasoning and posted a screenshot in the chat, and sharing of chats that contain images is apparently not supported as of now.
Edit: spelling.
Yes, I am aware of the sharing feature. The chat was in GPT-4o, not o1. When I try to share it, it tells me sharing of chats with images is not supported.
Thanks.
My bad.
I realized after re-reading your post that I had read that part wrong, and I deleted my post.
Yes, I have had the same experience. In my case, it was with proofreading short texts. I always preferred the o1-style option that only corrected mistakes, as it produced more straightforward results without additional commentary. It follows instructions to simply correct errors rather than refining the text, and it doesn’t add any quotation marks.
Interesting! Did this also happen recently, or is it something they have been testing for a longer time that eluded me for some reason?
Apparently they are trying to gather data on how 4o and o1 responses compare and how they are received by users, possibly to improve the responses of future models.
Edit: Or to train the models to decide when to use reflective reasoning, and in what cases a “normal” answer might be sufficient, which would allow for more tailored, efficient response generation?
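Purely to illustrate that speculation: deciding when to use reflective reasoning could amount to a small router in front of the two models. Everything in this sketch (the heuristic, names, and labels) is made up; a real router would presumably be learned from exactly the comparison data discussed above, and nothing here reflects how OpenAI actually does it:

```python
# Toy routing sketch; the heuristic and names are invented for illustration.
REASONING_HINTS = ("prove", "step by step", "derive", "debug", "why")

def needs_reflective_reasoning(prompt: str) -> bool:
    """Stand-in for a learned router: flag prompts that look like they
    need multi-step reasoning rather than a direct answer."""
    lowered = prompt.lower()
    return any(hint in lowered for hint in REASONING_HINTS)

def route(prompt: str) -> str:
    return "reasoning-model" if needs_reflective_reasoning(prompt) else "fast-model"

print(route("What's the capital of France?"))      # fast-model
print(route("Prove that sqrt(2) is irrational."))  # reasoning-model
```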
I too have seen more than my share of these comparisons.
This has been going on with ChatGPT for as long as I can remember. I started using ChatGPT about two months after it debuted, and it seemed much more prevalent back then. I even remember a time when it was the norm rather than the exception.
The technology is RLHF (reinforcement learning from human feedback), for those interested.
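For anyone curious what that looks like mechanically: each “pick the better answer” click yields one preference pair, and a reward model is typically trained on such pairs with a pairwise (Bradley-Terry) loss. Here is a minimal sketch of that standard formulation; the function names and numbers are invented for illustration and say nothing about OpenAI’s actual pipeline:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).

    Minimizing this trains a reward model to score the answer the user
    picked higher than the one they passed over.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Each comparison yields (prompt, chosen_answer, rejected_answer); the
# reward model scores both answers, and the loss separates the scores.
pairs = [
    (1.3, 0.4),  # model already agrees with the user -> small loss
    (0.2, 0.9),  # model disagrees with the user -> large loss
]
for r_chosen, r_rejected in pairs:
    print(f"loss = {preference_loss(r_chosen, r_rejected):.3f}")
```

The trained reward model is then used to fine-tune the chat model, which is how those side-by-side choices improve answers in the long run.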
It’s less the comparisons themselves that stand out, though, and more the fact that they provide two answers in a GPT-4o chat, one of which was generated with GPT-4o and the other with o1. That’s the novelty I noticed, not the offering of two answers to choose from.
I must have first seen it about a month ago. However, with ChatGPT, there’s a lot more going on. For example, this data could be used to fine-tune the 4o model’s response style; it doesn’t necessarily suggest that o1 will replace 4o anytime soon. I think that’s also what Eric is hinting at.