I appreciate the thorough background on the economics of large models serving large user volumes.
It is a catch-22 --smarter AI = more expensive inference time.
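To put rough numbers on that: for a dense transformer, a common rule of thumb is that a forward pass costs roughly 2 FLOPs per parameter per token, so inference cost scales about linearly with model size. A back-of-envelope sketch, where the $/FLOP price is purely hypothetical:

```python
# Back-of-envelope: per-token inference cost for a dense transformer.
# Rule of thumb: forward pass ~ 2 * N FLOPs per token (N = parameter count).
# The usd_per_flop figure below is hypothetical, for illustration only.

def cost_per_million_tokens(params: float, usd_per_flop: float = 1e-17) -> float:
    flops_per_token = 2 * params
    return flops_per_token * usd_per_flop * 1_000_000

for n in (7e9, 70e9, 700e9):
    print(f"{n / 1e9:>5.0f}B params -> ${cost_per_million_tokens(n):,.2f} per 1M tokens")
# A tenfold "smarter" (bigger) model -> roughly tenfold inference cost.
```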
The industry will sort it out. Back-propagation as self-tuning is not yet a competitive survival skill.
The present market is smoke and mirrors so long as it is still burning investor capital.
There will be a buoyancy point where the need for smarter AI can float the cost of smarter inference time.
Also, the market's evolutionary pressure is not, at present, based so much on the effectiveness of the smarter-output, more-effective product.
The market is presently based on investor burn confidence. Were a quantum-processing breakthrough to turn carbon paper into a supercomputer for pennies, investors would stop burning capital to out-compete on inference and compete instead on sudden quantum answers --but with the same algorithms… faster, yes, but the same algorithms.
This delineates two research frontiers: 1) how to do more on the cheap, and 2) how to do it differently --different algorithms.
E.g.: P. S. Prueitt proposed a mind model based in category theory in 1996, which modeled ‘understanding’ via multiple ‘knowledge’ graphs, in which a knowledge base (KB) is a subset of the model of understanding.
Five years later, P.S.P.'s mind model disappeared down the DARPA rabbit hole… wrapped in the (now former) Soviet intelligence community's cross-time/cross-channel category theory driven by random selections. Citing experience.
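For concreteness, here is a minimal toy sketch of the idea as stated above: ‘understanding’ as a family of knowledge graphs, with a KB as a chosen subset of that family. This is my own loose interpretation for illustration, not Prueitt's actual formalism; all names and structures here are assumptions.

```python
# Toy interpretation only -- NOT Prueitt's 1996 formalism.
# Assumption: 'understanding' is a family of knowledge graphs,
# and a knowledge base (KB) is a subset of that family.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Edge:
    """A directed, labeled relation between two concepts."""
    source: str
    relation: str
    target: str

@dataclass
class KnowledgeGraph:
    """One perspective: a named set of labeled edges over concepts."""
    name: str
    edges: set[Edge] = field(default_factory=set)

@dataclass
class Understanding:
    """'Understanding' modeled as multiple knowledge graphs."""
    graphs: dict[str, KnowledgeGraph] = field(default_factory=dict)

    def add(self, g: KnowledgeGraph) -> None:
        self.graphs[g.name] = g

    def knowledge_base(self, names: set[str]) -> dict[str, KnowledgeGraph]:
        # A KB is just a selected subset of the graphs in the model.
        return {n: g for n, g in self.graphs.items() if n in names}

# Usage sketch:
u = Understanding()
u.add(KnowledgeGraph("perception", {Edge("photon", "stimulates", "retina")}))
u.add(KnowledgeGraph("semantics", {Edge("retina", "signals", "vision")}))
kb = u.knowledge_base({"semantics"})  # the KB is a subset of the model
```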
Were a military-grade AI turned loose on desktops, the entire industry could turn inside out, and the commodity could become the royalty on the well-trained core --rather than a nuclear power plant feeding a multi-billion-dollar cash flow, with little benefit to the locality once fabricated.
My money is on nuclear power plants. My hope is for a future that leads us to the stars. My outlook for that? Meh.
Crystal ball report OFF