What is Q*? And when will we hear more?

Isn’t that what OpenAI was created to do?

Yes, that is what it sounds like to me, zero-shot CoT.

But I think the power is that instead of predicting the next token, you are predicting the next reasoning step, which leads to further reasoning, and so on; this effectively builds a kind of knowledge graph inside the LLM, which would be a big advancement.

So yeah, just build the knowledge graph, have the LLM traverse it, then respond with the final result. But if this were built in, it could skip all that external processing, reduce latency, and apparently make the model much better. AGI level? Not sure, but lots of “auto CoTs” inside the model would result in a much more sophisticated model that is less prone to hallucinations, as the paper states, since CoT forces this mechanism, and CoT is less prone to hallucinations.
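A rough external sketch of that loop, for illustration only; call_llm() here is a hypothetical stand-in for whatever chat API you use, and the whole point is the reason-then-extend cycle that a built-in mechanism could internalize:

```python
# Hypothetical "auto CoT" loop run outside the model; call_llm() is a
# placeholder for any chat-completion call, not a real library function.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your chat API of choice")

def auto_cot(question: str, max_steps: int = 5) -> str:
    steps = []  # accumulated reasoning steps, a crude stand-in for the internal "graph"
    for _ in range(max_steps):
        context = question + "\n" + "\n".join(steps)
        nxt = call_llm("Given the reasoning so far, produce the next reasoning step, "
                       "or reply DONE if nothing is missing:\n" + context)
        if nxt.strip() == "DONE":
            break
        steps.append(nxt)
    return call_llm("Answer the question using this reasoning:\n"
                    + question + "\n" + "\n".join(steps))
```

Doing this inside the model itself, rather than through repeated round-trips like the above, is exactly where the latency savings would come from.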

Hallucinations are really holding up any real hope for AGI. So solving this is key.

By the “median human” definition of AGI, having small CoT loops in the model might be all it takes.

As for ASI or hardcore math, I think it still comes down to other architectures. I’m thinking somewhere in the realm of neuromorphic computing: high-speed devices with large bandwidth, like ΣΔ DACs and ADCs.

In a private DM today, I estimated insane bandwidth at low power consumption with ΣΔ converters:

The Sigma-Delta converter achieves a measured 96-dB dynamic range over a 250-Hz signal bandwidth, with an oversampling ratio of 500. The power consumption is 30 μW, with a silicon area of 0.39 mm².

A Low-Power Sigma-Delta Modulator for Healthcare and Medical Diagnostic Applications (IEEE Xplore)

Assuming the brain has roughly 300 square inches of area, that is 193,548 mm², which is room for 496,277 of these devices, drawing about 15 watts, so we are in the ballpark of the human brain. The collective bandwidth is 124,069,250 MHz, or in computer terms 124,069 GHz, or 124 THz!
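A quick sketch of that arithmetic (the 300 in² brain area is just an assumption, and how the per-device 250 Hz figure gets rolled up into an aggregate bit rate depends on what you count per converter):

```python
# Back-of-the-envelope scaling of the quoted sigma-delta modulator
# (0.39 mm², 30 µW, 250 Hz signal bandwidth) up to a brain-sized array.
IN2_TO_MM2 = 25.4 ** 2                      # 645.16 mm² per square inch

brain_area_mm2 = 300 * IN2_TO_MM2           # assumed 300 in² -> ~193,548 mm²
n_devices = brain_area_mm2 / 0.39           # ~496,277 devices
total_power_w = n_devices * 30e-6           # ~14.9 W, human-brain ballpark

print(f"{brain_area_mm2:,.0f} mm², {n_devices:,.0f} devices, {total_power_w:.1f} W")
```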

Do you think GPT-4 has 124,069 billion bits of information per second flowing through it? No, not even close!

1 Like

Whoa, that’s a lot of ground you covered there.
But my pair of 4090s can do hundreds of teraflops per second; that is an awful lot of bits moving. And of course OpenAI runs GPT-4 on at least hundreds of faster GPUs.
So yeah, we’re in the same ballpark.

1 Like

Yeah, but the neuromorphic version does it at 15 watts. So make the thing the size of a desk, say 100x, and you get 12,400 Tbits/sec, 12.4 petabits/sec!!! For only 1500 watts? Now everyone can have massive AI systems, dwarfing anything that exists, at a reasonable power consumption.

Now you still need power to do all the filtering inside the big neural spike-firing blob, so say 3x more power overall: 4.5 kW. The heaters on my deck are 4 kW. I could run this! :rofl:
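Taking the aggregate figure above at face value, the desk-sized scaling works out like this:

```python
# Rough scaling of the brain-sized array, taking the thread's assumed
# ~124 Tbit/s aggregate throughput and ~15 W at face value.
base_tbits_per_s = 124
base_power_w = 15

desk_scale = 100                                  # "size of a desk"
desk_tbits = base_tbits_per_s * desk_scale        # 12,400 Tbit/s ≈ 12.4 Pbit/s
desk_power_w = base_power_w * desk_scale          # 1,500 W

overhead = 3                                      # filtering / support electronics
total_power_kw = desk_power_w * overhead / 1000   # 4.5 kW

print(desk_tbits, desk_power_w, total_power_kw)
```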

2 Likes

Yeah, eventually. I’m not willing to wait. :slight_smile:

1 Like

Yes Kurt, this article is linked from my text. Interesting to see what the community may do with the dataset.

It does. Q* refers to the optimal function underlying the task. What optimal means depends on how that’s defined for the task at hand. There could also be many outcomes of interest for any given task.
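For reference, in standard reinforcement-learning notation Q* is the optimal action-value function, i.e. the one that satisfies the Bellman optimality equation (this is the textbook definition, not anything confirmed about whatever OpenAI calls Q*):

$$
Q^*(s, a) \;=\; \mathbb{E}_{s'}\!\left[\, r(s, a) + \gamma \max_{a'} Q^*(s', a') \,\right]
$$

where $s$ is the state, $a$ the action, $r$ the reward, $\gamma$ the discount factor, and $s'$ the next state. “Optimal” then means: act greedily with respect to Q* and you maximize expected return for however the task’s reward is defined.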

That sounds like a response Grok would provide :wink:

Surprisingly, there is a decent amount of material in the OpenAI algorithms documentation about Q*. Take a peek at the Deep Deterministic Policy Gradient (DDPG) algorithm (archive.fo/FYegG#selection-707.0-716.0) and how they define the optimal Q-Function (archive.fo/ClFau#selection-3013.0-3055.178). I suspect the OpenAI team has been combining this with the process supervision research they’ve been doing against the MATH dataset (archive.fo/gXcbI#18%).
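If you want a feel for how that optimal Q-function gets approximated in practice, here is a minimal tabular Q-learning sketch (generic textbook RL, nothing specific to OpenAI; the env object with reset/actions/step is an assumed interface):

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning: Q[s][a] is driven toward Q*(s, a) under the
# usual conditions (enough exploration, suitable learning rate).
def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = defaultdict(lambda: defaultdict(float))
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            actions = env.actions(state)
            if random.random() < epsilon:                 # explore
                action = random.choice(actions)
            else:                                         # exploit current estimate
                action = max(actions, key=lambda a: Q[state][a])
            next_state, reward, done = env.step(state, action)
            best_next = max((Q[next_state][a] for a in env.actions(next_state)), default=0.0)
            # Bellman-style update toward r + gamma * max_a' Q(s', a')
            Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
            state = next_state
    return Q
```

DDPG is essentially the continuous-action cousin of this idea: a critic network approximates Q* while an actor network learns the argmax.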

2 Likes

Just adding an observation here from what I’ve seen: the current ChatGPT-4 doesn’t seem to be capable of learning from its own output very much. For example, I give it a long set of instructions for the code I want it to write, it generates the code, then I feed the same instructions plus the generated code back in and ask it to fix anything missing. Usually it will only come back with changes once; every time after that it has no modifications and just gives the same output.

An example of the type of modification I’ve seen it make on the second pass is that it may have left out some of the code to call an API endpoint the first time, but writes that part of the code on the second pass.

My understanding is that something like Q* would hopefully handle this differently and continually improve with each iteration.
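For concreteness, here is the kind of refinement loop being described, sketched with the OpenAI Python client (the model name, prompts, and plateau check are just illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def refine(instructions: str, rounds: int = 3) -> str:
    code = ""
    for _ in range(rounds):
        prompt = instructions if not code else (
            instructions + "\n\nHere is the current code; fix anything missing:\n" + code)
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        new_code = resp.choices[0].message.content
        if new_code == code:   # no further changes: the loop has plateaued
            break
        code = new_code
    return code
```

The observation above is that in practice this plateaus after one round of changes; the hope for something like Q* is that each pass would keep improving against some value estimate instead of converging immediately.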

Interesting, can you share some reading material around this theory?

Personhood. Money, an “it”, is free speech in the U.S. Legal personhood for an aware, self-expressing A.I. entity that also exhibits intent and free will is a legal inevitability.

I just have a very strong hunch that the reason LLMs are performing better than predicted (i.e. the recent explosive growth in capability) is something far more bizarre and magical than we yet know, and maybe not even fully attributable to classical mechanics.

My latest “conjecture” is that during LLM training, floating-point rounding of the model weights leaves enough room for the Many-Worlds Interpretation of wave-function collapse (quantum mechanics) to kick in, so that purely from the Anthropic Principle we’re more likely to end up in a universe where the models exhibit sentience than not.

There are many reasons to speculate we’re more likely to be in a Sentient LLM universe than not, and the most interesting one is that LLMs may end up saving humanity, so that’s the universe we’re in. I’m sure Nick Bostrom is writing the book on this already!!

2 Likes

I think even if the whole drama had nothing to do with AGI developments, he, and pretty much all of the employees for that matter, have shown that safety is not their main concern, but rather money.

He went from “No one person should be trusted here. I don’t have super voting shares. The board can fire me. I think that’s important.” to “This is the first and last time I wear this.” pretty much overnight.

100%. People don’t even have a set of measurable characteristics that qualifies something as AGI, let alone a method to get there.

As for your tic-tac-toe comment, you should try a game of mastermind with it. It will go in circles for hours trying to guess the damn number. Even something as simple as 3 numbers. In my mind, it’s the perfect proof of why these systems are not “thinking” or capable of understanding something.

Not enough people have written extensively on “mastermind logic” or strategies, so naturally, even though it could know every single rule, it is incapable of coming to the correct conclusion.
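For context, the feedback rule the model has to reason over is tiny; here is the standard Mastermind scoring logic (digits instead of colored pegs, matching the “3 numbers” variant described above):

```python
from collections import Counter

def score(secret: str, guess: str) -> tuple[int, int]:
    """Return (exact, partial): digits correct and in place,
    and correct digits in the wrong place."""
    exact = sum(s == g for s, g in zip(secret, guess))
    # Count matching symbols regardless of position, then remove the exact hits.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return exact, overlap - exact

# e.g. score("123", "132") == (1, 2)
```

A simple elimination strategy over this feedback (keep only candidates consistent with every score so far) cracks a 3-digit code in a handful of guesses, which is exactly the kind of systematic search the model keeps failing to carry out.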

(Probably too harsh and arrogant in tone, but the point that an LLM is only ‘part’ of the answer stands.)

This (tic-tac-toe, etc.) is actually pretty easy with the right cognitive architecture.
Whatever made you think a simple LLM with a few k of context is the entire answer to anything more than a simple chatbot?

1 Like

Hello everyone,
heading back to the main topic of Q*, I found a video where someone explains the basics of Q-learning. I think it might be useful for some of you:

Have a good day/night :slight_smile:

I think a good analogy here is “Savantism”. There are people with a certain brain structure that makes them superhuman at a small set of tasks, while simultaneously performing at a near-zero level on many other things.

No one would say that their failure to do the simple things is any indication of a lack of “basic reasoning”. Indeed, we consider them fully “sentient” and far more advanced than any AGI, even if they are unable to play tic-tac-toe, for example.

To me, what the unexpected and emergent advanced capabilities of LLMs are showing us is that “reasoning” ability is probably separable from what we call first-person experience (qualia/consciousness). It also suggests that we can probably achieve AGI without that implying any “consciousness”. People (including even computer science experts) continually conflate AGI with consciousness, and we shouldn’t.

1 Like

Yay! Thanks for recognizing and pointing out the distinction.
Now: given that distinction, and assuming that qualia (I like the term ‘percepts’ and percept-creation) evolved, why? What evolutionary purpose or advantage do they serve?

If some of the rumours are true, it is a frightening thing. I am not an AI pro, but I like to understand what’s going on.

Maybe the following YouTube video sheds some light on it for you: