It is not a “q-star” whatever, like the nonsense video headline would have you believe.
This is from Stanford.
STaR = Self-Taught Reasoner (multi-step agent)
Quiet, because only the final result is presented and the intermediate reasoning stays hidden. That part you've always been able to do yourself, even with an OpenAI cookbook example.
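
To make the "only the final result is presented" part concrete, here is a rough sketch of the do-it-yourself version: let the model reason step by step, then strip everything except the answer. This is not the training method from the Stanford work, just the prompt-level pattern. `answer_quietly`, `fake_model`, and the `FINAL:` marker are made-up names for illustration, and `generate` stands in for whatever LLM call you actually use.

```python
from typing import Callable

FINAL_MARKER = "FINAL:"  # hypothetical delimiter, not from any library


def answer_quietly(question: str, generate: Callable[[str], str]) -> str:
    """Let the model reason in full, but return only the final answer."""
    prompt = (
        "Think through the problem step by step, then give the answer "
        f"on a last line starting with '{FINAL_MARKER}'.\n\n"
        f"Question: {question}"
    )
    full_output = generate(prompt)  # includes the intermediate reasoning
    # Keep only the text after the last marker; the reasoning is dropped.
    _, found, final = full_output.rpartition(FINAL_MARKER)
    # If the model ignored the format, fall back to the whole output.
    return final.strip() if found else full_output.strip()


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return "17 + 25: 17 + 20 = 37, then 37 + 5 = 42.\nFINAL: 42"


print(answer_quietly("What is 17 + 25?", fake_model))
```

Swap `fake_model` for a real API call and the caller still only ever sees the last line, never the working.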