I always wonder why GPT-4o is more powerful than GPT-4-turbo yet so much cheaper. Does that mean models will get cheaper as they get more powerful? It doesn’t seem very logical.
Four reasons:
- Better ideas. As they have been building these models they learn what works, what doesn’t, and how to work more efficiently. By incorporating better design ideas into the architecture of the model, they will be able to get better results with the same number of parameters or similar results with fewer parameters. They can also identify other efficiencies to reduce model size or RAM requirements. The effect of this is that newer models can pump out more tokens in less time using less electricity. So the models can be cheaper.
- Better technology. They have been rapidly deploying newer, more efficient GPUs as quickly as they can get them. These newer GPUs result in much greater throughput, particularly for inference tasks. The effect of this is that newer models can pump out more tokens in less time using less electricity. So the models can be cheaper.
- Better data. As they collect more and more user data and determine what is working and what isn’t, they can curate that data so it contains a much better signal-to-noise ratio for the tasks they want the model to perform well on. Having better, cleaner data means they need fewer parameters to give the model the capabilities they want it to have. The effect of this is that newer models can pump out more tokens in less time using less electricity. So the models can be cheaper.
- They’re not better at everything. They are definitely prioritizing the capabilities that people use the most. By doing this they can get most people to experience generally better results. So the models will be (or at least appear) stronger, even if they become weaker in some areas. By putting more-wood-behind-fewer-arrows, so to speak, they can reduce the overall model size without sacrificing much (if any) quality for the vast majority of users and use cases. The effect of this is… you can probably guess by now.
GPTs like GPT-4o, and all generative AI, run on mathematical calculations that produce the responses. These calculations run on servers with processing units called GPUs, along with memory and so on. They are huge ‘computers’ stacked together to do the math behind AI responses. These servers use power, and there are other costs associated with running them.
Sometimes a better algorithm in software means the model needs less computational capacity and less memory. So a software-level improvement results in more efficient hardware usage: a more powerful model that uses less compute. This translates into better performance at a cheaper consumer price.
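To make the compute-to-price link concrete, here is a back-of-envelope sketch. All the numbers (GPU hourly cost, tokens per second) are hypothetical, purely to show how higher throughput per GPU translates into a lower serving cost per token:

```python
# Back-of-envelope: why efficiency gains lower the price per token.
# All numbers below are made up for illustration, not real OpenAI figures.

def cost_per_million_tokens(gpu_hourly_cost: float, tokens_per_second: float) -> float:
    """Serving cost in dollars for 1M tokens on one GPU at a given throughput."""
    seconds_needed = 1_000_000 / tokens_per_second
    return gpu_hourly_cost * seconds_needed / 3600

# Hypothetical older model: 2,000 tokens/s on a $4/hour GPU.
old = cost_per_million_tokens(4.0, 2_000)

# Hypothetical newer model: better architecture plus a faster GPU gives
# 3x the throughput at only modestly higher hardware cost.
new = cost_per_million_tokens(5.0, 6_000)

print(f"old: ${old:.2f} per 1M tokens")  # -> old: $0.56 per 1M tokens
print(f"new: ${new:.2f} per 1M tokens")  # -> new: $0.23 per 1M tokens
```

Even with a pricier GPU, tripling throughput more than halves the cost per token, which is the kind of headroom that lets a provider cut prices while shipping a stronger model.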
It could also be an incentive to get people onto newer versions, since potential bugs exist in the older ones. Because the older models still exist in a relatively unchanged form, it stands to reason that newer versions are made more attractive through price and capability for reasons more related to redirecting user focus than to efficiency gains, especially in a time of extreme inflation when cheaper access matters. Competition may play a role, but that hasn’t quite been a factor yet.