They’re obviously not going to release a model where each query costs $2k. They’ve worked out how to unlock a fairly dramatic jump in intelligence, and now the focus will be on delivering that at a lower price point. A big part of that will probably be some form of fine-tuning on the complete model’s output (sketched below), where a less compute-intensive model learns to narrow the evaluation space rather than searching the full evaluation space the way the complete model does*.
- My understanding is that the new model essentially generates hundreds or thousands of o1-style answers and then uses a further step to evaluate them and pick the best one (possibly with additional rounds of iteration on top). So the core cost is the number of complete answers generated per query (see the sketch below).
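A minimal sketch of that generate-then-evaluate loop, assuming a simple best-of-N setup. Everything here is a hypothetical stand-in, not anything OpenAI has confirmed; the point is just that compute scales with the number of sampled answers:

```python
import random

def generate_answer(prompt: str) -> str:
    """Stand-in for one full o1-style reasoning pass (the expensive part)."""
    return f"candidate-{random.randint(0, 9999)} for: {prompt}"

def evaluate(answer: str) -> float:
    """Stand-in scorer; a real system might use a learned verifier,
    majority voting, or another model acting as judge."""
    return random.random()

def best_of_n(prompt: str, n: int = 1000) -> str:
    """Generate n complete answers, then keep the highest-scoring one.
    Compute, and therefore cost, scales linearly with n."""
    candidates = [generate_answer(prompt) for _ in range(n)]
    return max(candidates, key=evaluate)

print(best_of_n("Prove that sqrt(2) is irrational.", n=100))
```

Here `n` is the cost knob: shrinking it from thousands to tens is exactly the saving the fine-tuning idea above is aimed at.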
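And a sketch of that distillation idea, reusing `best_of_n` from the block above: run the expensive search offline, keep only the winning answers, and fine-tune a cheaper model on those pairs so it produces winner-quality answers directly. Again purely illustrative; the actual training recipe is not public:

```python
def build_distillation_set(prompts: list[str], n: int = 1000) -> list[tuple[str, str]]:
    """Run the expensive best-of-n search once per prompt and keep only
    the winning (prompt, answer) pairs as fine-tuning data for a smaller
    model, which can then skip the n-way search at inference time."""
    return [(p, best_of_n(p, n)) for p in prompts]

pairs = build_distillation_set(["2+2?", "Capital of France?"], n=50)
for prompt, best in pairs:
    print(prompt, "=>", best)
```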