I find that almost all ChatGPT models are now named with a small “o” and a digit that is mostly a vertical stroke (e.g. 1 or 4), which makes them quite difficult to distinguish at first glance. I constantly hit 4o-mini by accident when I wanted o1-mini.
Please remember that the point of naming is descriptiveness, not marketing gimmickry.
I would just number them consecutively, without restarting the count arbitrarily and without putting an “o” in every model name. Functional distinctions, such as distilled models, can still be marked with terms like mini.
Well, to be fair, no one knows if gpt-5 will be released. I think o1 is just a different kind of model. The tts models are for text to speech, and that line is currently at tts-1, for example. They didn’t restart the count; they created a new type of model.
Back in the day, in the gpt-2 and gpt-3 era, the names were things like text-davinci-001, text-davinci-002, and text-ada-002, and they eventually changed to gpt-3.5.
So what you are saying is that naming it o1 was bad because it gets confused with gpt-4o, and it should perhaps have been reason-1 instead. I get that, but I think it’ll be hard for them to change it now.
You can always abstract the names away in many different ways: create aliases, keep the names in a separate file, assign them to variables like reason = "o1-preview", and so on. I do suspect they started using o1 for marketing, or because “o” means quantization to them, or because there are plans to buy o.com and replace chatgpt in the future; maybe the marketing team got to name it and the dev team fought it. Regardless, I’m sorry you are going through this, and I hope they take this topic into consideration for future model naming.
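To make the alias idea concrete, here is a minimal sketch of what I mean; the constant names are made up, so call them whatever reads well in your codebase:

```python
# model_names.py - the one place that knows the vendor's actual model ids.
# The alias names below are my own invention; rename to taste.
REASON = "o1-preview"       # slower, deliberate reasoning model
REASON_SMALL = "o1-mini"    # cheaper reasoning model
CHAT = "gpt-4o"             # general-purpose chat model
CHAT_SMALL = "gpt-4o-mini"  # cheap general-purpose chat model
```

The rest of your code only ever imports REASON or CHAT_SMALL, so when the names change again you edit one file.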
This is what I do: I use multiple models from different providers and have to surface them to users who aren’t technical, so I came up with my own naming scheme:
good = gpt-4o-mini
better = gpt-4o
best = o1-preview
I can’t be 100% certain users understand my naming scheme and the differences between the models, but I haven’t heard any complaints yet.
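In code it is nothing fancier than a small lookup table; the tier names are mine, and the model ids are simply whatever each tier currently points at:

```python
# Map user-facing tiers onto the vendor model ids they currently resolve to.
TIERS = {
    "good": "gpt-4o-mini",
    "better": "gpt-4o",
    "best": "o1-preview",
}

def resolve_model(tier: str) -> str:
    """Translate a user-facing tier name into the real model id."""
    return TIERS[tier]

print(resolve_model("best"))  # -> o1-preview
```

Users only ever see good/better/best, so swapping a model underneath a tier changes nothing on their side.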
I never really use o1-preview, because it is always wrong about everything: it can’t code correctly and it filters history. I hadn’t considered it “best”. I just use it if I run out of my o1-mini allocation, and that works some of the time for specific coding questions.
My scenarios often involve needing to reason over complex data and o1-preview is really good at that.
It’s worth noting that, for me, both “better” and “best” are always paired with gpt-4o-mini. Mini pre-screens all of the information, and then I use either gpt-4o or o1-preview to rework mini’s conclusions. For large RAG-like corpora this gives you gpt-4o- or o1-preview-quality answers at significantly reduced cost. With o1-preview I can perform large RAG-like tasks for under $2 per million tokens and in about 1/20th of the time it would take o1 to process the same amount of information. If anything, the answers are better, not worse.
I’m doing essentially the LLM equivalent of upscaling.
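For anyone who wants to try the same thing, the shape of it is roughly this. It is only a sketch, assuming the openai Python SDK, and the prompts and function names are placeholders rather than what I actually run:

```python
# Two-pass "upscaling": a cheap model screens the corpus chunk by chunk,
# then a stronger model reworks the screened notes into the final answer.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(model: str, prompt: str) -> str:
    """Single chat-completion call; every call in the pipeline goes through here."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def upscale_answer(question: str, chunks: list[str], strong_model: str = "o1-preview") -> str:
    # Pass 1: gpt-4o-mini pre-screens each chunk, keeping only what looks relevant.
    notes = [
        ask(
            "gpt-4o-mini",
            f"Question: {question}\n\nExtract only the facts relevant to the "
            f"question from this text:\n{chunk}",
        )
        for chunk in chunks
    ]
    # Pass 2: the stronger model reworks mini's conclusions into the final answer.
    combined = "\n\n".join(notes)
    return ask(
        strong_model,
        f"Question: {question}\n\nUsing only these screened notes, write the "
        f"final answer:\n{combined}",
    )
```

The cost and speed win comes from pass 1: mini touches every token of the corpus, while the expensive model only ever sees the screened notes.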