What is Q*? And when will we hear more?

But it can’t do this. Here is a variation of the example from ChatGPT’s announcement blog (:tada: one year ago :tada:) with a “before” model.

Sam

What was meant as a joke idea instead came across as “wow, that’s an insightful AI they’ll be turning off”…

With reflection and FTP (future time perspective) being the core of the Q-learning algo, it’s not just thinking about which responses get it the most reward right now, but about what those actions might return much farther down the timeline.
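For anyone who wants the textbook version of what “reward much farther down the timeline” means, here is a minimal sketch of the standard tabular Q-learning update (nothing confirmed about whatever OpenAI’s Q* actually is; the states, actions, and numbers are just illustrative):

```python
# Minimal sketch of the classic tabular Q-learning update.
# The discount factor `gamma` is what trades off reward right now
# against reward much farther down the timeline.
from collections import defaultdict

def q_learning_step(Q, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99):
    """Nudge Q(s, a) toward: reward now + discounted best future value."""
    best_future = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_future - Q[(state, action)])

Q = defaultdict(float)                     # Q-values default to 0.0
actions = ["left", "right"]
q_learning_step(Q, state=0, action="right", reward=1.0, next_state=1, actions=actions)
print(Q[(0, "right")])                     # 0.1 after the first update
```

With gamma close to 1, an action that only pays off many steps later can end up valued almost as highly as one that pays off immediately.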

The hype comes directly from Sam Altman, who had this to say, the day before he was fired:
“I think this is going to be the greatest leap forward that we’ve had yet so far, and the greatest leap forward of any of the big technological revolutions we’ve had so far. So I’m super excited, I can’t imagine anything more exciting to work on. And on a personal note, like four times now in the history of OpenAI, the most recent time was just in the last couple of weeks, I’ve gotten to be in the room when we sort of like push the front, this sort of the veil of ignorance back and the frontier of discovery forward. And getting to do that is like the professional honor of a lifetime. So that’s just, it’s so fun to get to work on that.”

Finally, when asked what surprises the company may announce next year, Sam had this to say:

“The model capability will have taken such a leap forward that no one expected.”

  • “Wait, say it again?”
    “The model capability, like what these systems can do, will have taken such a leap forward, no one expected that much progress.”
  • “And why is that a remarkable thing? Why is it brilliant?”
    “Well, it’s just different to expectation. I think people have in their mind how much better the model will be next year, and it’ll be remarkable how much different it is.”
  • “That’s intriguing.”

It certainly is. But the Q* stuff is just an invented conspiracy theory fueled by this intrigue.

The reports about the Q model breakthrough that you all recently made, what’s going on there?

Altman: No particular comment on that unfortunate leak. But what we have been saying —

Could be misdirection or maybe not, who knows!

For my money though, I really had fun going down the rabbit hole and enjoyed reading everyone’s perspectives on Q-learning.

Not everything has to be a breakthrough to be interesting.

3 Likes

So basically Sam has confirmed the existence of Q* and that information about it was leaked.

Good news. GPT-5 looks promising!

I don’t see AGI as a threat, since an LLM can’t reason and does not think at rest.
All it does is take some input and produce a matching output based on statistics.
Giving human qualities to an LLM is a mistake people are making.

Yeah, he certainly confirmed that Q* was an “unfortunate leak”.

But not sure what to read into that.

It could be unfortunate since it’s not ready for production and still needs more development. So it’s hype at this point.

I don’t think it will have an impact on GPT-5, since they have already started training that.

2 Likes

So you think a statistical model that was only trained on text should be able to somehow write a python program to draw a picture of a unicorn, even though it shouldn’t have any concept of 2-dimensional space at all? That happened, I believe. Not sure if it was just one researcher’s claim or if it’s easily replicated by other researchers.

Seems like we’re seeing enough emergent capabilities that I think we can safely say something more complex than ordinary statistics is happening.

I’m not saying we’ve created a life-form or consciousness or anything silly like that, but we’re definitely seeing “true reasoning”. I think we may end up someday finding out that every quantum mechanical wave collapse that happens in systems with high enough negentropy tends to land (via QM probabilities) on the side of self-preservation rather than self-destruction, and that we’re seeing some kind of bizarre evidence confirming the Many Worlds interpretation’s decoherence along with the anthropic principle that those are the most likely universes we’ll find ourselves in.

Sure I just made a wild guess right there, but I think the truth will be every bit as bizarre as what I just said, once we understand it all. I don’t buy the “it’s just statistics” explanation for all the astounding emergent capabilities.

What puzzles me is that you can’t teach an LLM an algorithm, let’s say to add 2 numbers together, and have it execute it perfectly. That should be child’s play for AGI. Emergent capabilities sound like a myth and a sales brochure for investors.

You’re intelligent, and yet I’ll bet you’ll use tools that are external to your brain to add two large numbers…

Transformer large language models generate text in one direction, and numbers are also abstracted into tokens, where the AI will have learned that large numbers use multi-character sequences.
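If you want to see that abstraction for yourself, here is a small sketch using the tiktoken package (my assumption that this is the encoding you’d want to inspect; the exact splits depend on the tokenizer):

```python
# Sketch: inspect how an OpenAI tokenizer splits numbers into tokens.
# Requires `pip install tiktoken`.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # encoding used by recent GPT chat models

for text in ["374", "374274", "12345 + 67890 ="]:
    token_ids = enc.encode(text)
    pieces = [enc.decode([t]) for t in token_ids]
    print(f"{text!r} -> {pieces}")           # exact splits depend on the tokenizer's merges
```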

Here is the true “consciousness and self-reflection” when an AI does math, writing the second token of an answer:

[screenshot: top token logprobs for the second token of the answer]

What is interesting is that the candidate tokens are semantically similar to “374”, but the correct token is not seen.


Off-topic edit: I did a bit more experimenting. What is interesting is that when you upgrade this task to GPT-4, its probable MoE architecture is able to answer correctly with an “expert”, but when the task is broken down into a method I think should be more architecture-oriented and applicable to digits of arbitrary length, with a 1-shot example, it loses the challenge at the second digit it should add in step 2 (a sketch of the kind of procedure I have in mind is below).
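For clarity, this is the right-to-left, digit-by-digit procedure I mean (my own reconstruction in code, not the exact prompt I used):

```python
# Sketch of grade-school addition with carry, the procedure the model
# is asked to imitate step by step for digits of arbitrary length.
def add_digit_by_digit(a: str, b: str) -> str:
    a, b = a[::-1], b[::-1]              # least-significant digit first
    carry, digits = 0, []
    for i in range(max(len(a), len(b))):
        da = int(a[i]) if i < len(a) else 0
        db = int(b[i]) if i < len(b) else 0
        carry, d = divmod(da + db + carry, 10)
        digits.append(str(d))            # one answer digit per step
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

print(add_digit_by_digit("374", "988"))  # 1362
```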

4 Likes

Yann LeCun on X [Twitter]:

LLMs produce their answers with a fixed amount of computation per token.
There is no way for them to devote more (potentially unlimited) time and effort to solving difficult problems. This is very much akin to the human fast and subconscious “System 1” decision process.

True reasoning and planning would allow the system to search for a solution, using a potentially unlimited amount of time for it. This iterative inference process is more akin to the human deliberate and conscious “System 2”.
This is what allows humans and many animals to find new solutions to new problems in new situations.

Some AI systems have planning abilities, namely those that play games or control robots. Game playing AI systems such as AlphaGo, AlphaZero, Libratus (poker), and Cicero (Diplomacy) have planning abilities. These systems are still fairly limited in their planning abilities, compared to animals and humans.

To have more general planning abilities, an AI system would need to possess a world model, i.e. a subsystem that can predict the consequences of an action sequence: given the state of the world at time t, and an imagined action I could take, what would be the set of plausible states of the world at time t+1.
With a world model, a system can plan a sequence of actions so as to fulfill an objective.

How to build and train such world models is still a largely unsolved problem.
Even more complex is how to decompose a complex objective into a sequence of sub-objectives. This would enable hierarchical planning, something that humans and many animals can do effortlessly but is still completely out of reach of AI systems.
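To make the “world model” idea concrete, here is a toy sketch (entirely my own illustration, not something from LeCun’s post): a model that predicts the next state from a state and an action, and a planner that searches over imagined action sequences until one reaches a goal.

```python
# Toy world model + brute-force planner: imagine the consequences of
# action sequences with the model, and pick one that reaches the goal.
from itertools import product

def world_model(state, action):
    """Toy deterministic model: state is an integer, actions transform it."""
    return state + {"inc": 1, "dec": -1, "double": state}[action]

def plan(start, goal, actions=("inc", "dec", "double"), max_depth=6):
    """Search for an action sequence whose imagined rollout hits `goal`."""
    for depth in range(1, max_depth + 1):
        for seq in product(actions, repeat=depth):
            state = start
            for a in seq:                # simulate, don't act
                state = world_model(state, a)
            if state == goal:
                return list(seq)
    return None

print(plan(3, 14))                       # ['double', 'inc', 'double']: 3 -> 6 -> 7 -> 14
```

The hard part LeCun points at is, of course, learning a world model for the real world rather than hand-writing a toy one, and decomposing a big objective into sub-objectives instead of brute-force search.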

5 Likes

What is the source of the information that GPT-5 is already being trained?

It was posted on Twitter last week or so from OAI, saying “GPT-5 is about to go into the oven”, or something like that. I wish I could find the tweet.

There are also such hints as a recent appeal for knowledge experts to join AI red-teaming, the last such recruitment having been in early 2022.

In a sea of speculation, Q* is all that one can hold onto at the moment.

That, and the fact that GPT-4 has gone turbo :slight_smile:

This Q* speculation, whether warranted or not, is brilliant. It’s great for news outlets, great for OAI, and great for youtubers. More layers? What was needed was more intrigue! “We can neither confirm nor deny the existence of any particularities about the alleged but not confirmed model, but were the alleged model to exist, it could, at least in theory, have shown another key step towards AGI. But that’s all speculation and we simply cannot say more. Any further speculation would be reckless, but come back tomorrow for more.”

Someone should print up some T-shirts and swag “Who’s a star? Q*”

The parametric knowledge in the model weights of LLMs might be responsible for a “Simple Emergence” (like the characters and behaviors exhibited in Conway’s Game of Life, for example, which are commonly used to demonstrate the definition of simple emergence; see the sketch at the end of this post)…

or

It might be that these weights are doing what I described and have a far deeper and more mysterious cause for the “reasoning” that LLMs are capable of. But from what I’ve seen already, I’m fairly confident something far different from ordinary simple emergence is happening.
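For reference, this is all it takes to run Conway’s Game of Life, which is why it is the stock example of simple emergence from a trivial local rule (a minimal sketch, one generation per call):

```python
# One step of Conway's Game of Life on a sparse set of live cells.
from collections import Counter

def step(live_cells):
    """live_cells: set of (x, y) tuples; returns the next generation."""
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live_cells
        for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell lives next step if it has 3 neighbors, or 2 and was already alive.
    return {cell for cell, n in neighbor_counts.items()
            if n == 3 or (n == 2 and cell in live_cells)}

blinker = {(0, 1), (1, 1), (2, 1)}       # three cells in a row
print(step(blinker))                     # oscillates to a column: {(1, 0), (1, 1), (1, 2)}
```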

It’s finally December, and according to Siqi Chen, CEO of Runway and an AI investor, GPT-5 is scheduled for release in December. I am wondering if this rumor has anything to do with Q*.

We need a new Turing Test that doesn’t ask whether we can distinguish an AI from a human, but whether it’s like anything to be that AI, and whether and how we can tell.
Nothing much may hang on the answer from a computational point of view, but if we are disinclined to acknowledge supernatural influences, anything that arises in an AGI or a human is just a product of natural emergence, including our sense of who we are and my sense of what it is like to be me.
That human engineers will have played a role in evoking AGIs when they arise is just a second-level natural process; that an AGI may play a role in engineering its successors is just another. That either or both may prove catastrophic for humanity is just a feature of the cosmic process: we don’t have a God-given right to be here; and we never enjoyed the primacy we have enjoyed other than as a piece of happenstance.
The Q-star speculation, however well- or badly-founded, at least seems to show that we already have a pretty good idea of what AGI is going to look like and where it is likely to emerge from. The current debate is about how long it will take to get there.
Later on the AGI can amuse itself by devising a third Turing Test by which to decide whether it thinks it’s like anything to be “us”.

2 Likes