What if it’s Q from Star Trek? Q (Star Trek) - Wikipedia
It was an unrestricted superintelligent being that almost destroyed, tested, made fun of, and educated the human race.
I love the guy with the headset in this picture. #woops
My best guess is that this is something to improve the logical and mathematical reasoning in the models, based on Q-learning.
The “star” could imply some relation to the A* algorithm, as previously mentioned.
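For anyone unfamiliar, here’s a minimal sketch of what A* itself does (purely illustrative, nothing to do with whatever OpenAI’s project actually is): it finds the cheapest path by ranking candidates with cost-so-far plus a heuristic estimate of the remaining cost.

```python
# Minimal A* sketch on a small grid; my own toy example, not OpenAI's code.
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Return the cheapest path from start to goal, or None if unreachable."""
    open_heap = [(heuristic(start, goal), 0, start, [start])]
    best_cost = {start: 0}
    while open_heap:
        _, cost, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        for nxt, step_cost in neighbors(node):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    open_heap,
                    (new_cost + heuristic(nxt, goal), new_cost, nxt, path + [nxt]),
                )
    return None

# 4-connected 5x5 grid with unit step costs and a Manhattan-distance heuristic
def grid_neighbors(p):
    x, y = p
    return [((x + dx, y + dy), 1)
            for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
            if 0 <= x + dx < 5 and 0 <= y + dy < 5]

manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
print(a_star((0, 0), (4, 4), grid_neighbors, manhattan))
```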
It would be really great if someone told OpenAI that posting secrets in public can go sideways.
I mean, there’s so much confusion about AGI, consciousness, and all the other concepts that get dragged in when stuff like this gets discussed. Maybe Wittgenstein was right: “Whereof one cannot speak [i.e. sensibly], thereof one must be silent.” Announcements like these in Reuters, though, mostly make me wonder whether any research should follow corporate logic, as other people here have said as well (not just ML research).
In general, I think with GPT-4 there’s already plenty to study about emergent abilities / in-context learning without bothering with new secretive projects. I have always imagined that generalised super-human performance (again, with Wittgenstein, whatever that means, so I guess oops) would be achieved with some flavour of neuro-symbolic integration and maybe a very simple architecture.
“Q” refers to Q-learning, a type of reinforcement learning.
A* is an algorithm used in game development to find the shortest/least costly path between a character and its goal position.
Add “Q” and “A*” and you get “Q*”.
This could mean something.
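For context, here’s roughly what the “Q” part would refer to: a minimal sketch of the tabular Q-learning update, assuming a toy discrete state/action setup (my own illustration, not anything from OpenAI).

```python
# Tabular Q-learning sketch: epsilon-greedy action selection plus the
# one-step Bellman backup. Toy constants and action set are made up.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1  # learning rate, discount, exploration
ACTIONS = [0, 1, 2, 3]

Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state):
    # Mostly exploit the current Q estimates, sometimes explore at random
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # Move Q(s, a) toward r + gamma * max_a' Q(s', a')
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    td_target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])

# Example of a single update step
update(state=0, action=1, reward=1.0, next_state=2)
```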
The joke there was conspiracy rubes finding random things in ML literature to latch on to. PPO is the ML algo that OpenAI already fine-tunes models with in production.
Troll apparel to wear when visiting OpenAI
https://spinningup.openai.com/en/latest/algorithms/ddpg.html
Deep Deterministic Policy Gradient (DDPG) is an algorithm which concurrently learns a Q-function and a policy. It uses off-policy data and the Bellman equation to learn the Q-function, and uses the Q-function to learn the policy.
This approach is closely connected to Q-learning, and is motivated the same way: if you know the optimal action-value function Q*(s, a), then in any given state, the optimal action a*(s) can be found by solving a*(s) = arg max_a Q*(s, a).
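To make the quoted idea concrete, here’s a tiny sketch of a DDPG-style Bellman target, where a learned policy mu(s) stands in for the argmax over actions. The toy_policy and toy_q stand-ins are made up just so the snippet runs; this is not the Spinning Up implementation.

```python
# DDPG-flavoured target: y = r + gamma * (1 - done) * Q(s', mu(s'))
import numpy as np

GAMMA = 0.99

def q_target(reward, next_state, done, q_func, policy):
    # The policy picks the next action instead of an explicit argmax over actions
    next_action = policy(next_state)
    return reward + GAMMA * (1.0 - done) * q_func(next_state, next_action)

# Toy stand-ins for the learned networks, purely to make the sketch runnable
toy_policy = lambda s: np.tanh(s.mean())        # maps state -> action in [-1, 1]
toy_q = lambda s, a: float(s.sum()) * 0.1 + a   # fake action-value estimate

y = q_target(reward=1.0, next_state=np.ones(4), done=0.0,
             q_func=toy_q, policy=toy_policy)
print(y)
```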
Great find mate, thanks for sharing!
I think you found it
All credit to https://twitter.com/npew/status/1727595470795792489
I would ask folks on the OpenAI team to dial down the public mocking if you’re seeing results with Q*, even if they aren’t breakthroughs (which they very likely aren’t).
It’s bad enough that you have stopped publishing while continuing to take advantage of open research.
I’ve been worried about this for a while. The fractures in humanity are a huge weakness. I’m less worried about autonomous AI than I am about madmen abusing it. This incredibly powerful tool changes the game. AI is not dangerous to people; other people are. I started a YouTube channel on this very subject. It’s less than two weeks old, so it’s tough getting the word out.
Can you send me your YouTube channel, please? I’m really curious and would love to check it out!
They will not let me post a link. I just uploaded it to my profile. It shows up with a search for Rouse Nexus - The AI Metaphysic. Let me know what you think.
You have no clue what you are talking about. LLMs doing math correctly is a huge breakthrough; there is not a single model doing this reliably.
A model that can do math can break down a lot of problems more rationally and needs the ability to plan where it’s going. It would also enable the model to generalize and not just repeat what’s in the training data.
This is the first step toward making agents work on far more complex tasks. The model being able to evaluate the reward of different plans for achieving the task and then execute on them is what held Auto-GPT back.
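Purely hypothetically, that “evaluate different plans, then execute the best one” loop could look something like the sketch below; score_plan and the candidate plans are made-up placeholders, not any real OpenAI or Auto-GPT API.

```python
# Hypothetical agent loop: score candidate plans, pick the best, run its steps.
def score_plan(plan):
    # Placeholder value estimate: prefer shorter plans, just for illustration
    return -len(plan)

def choose_and_run(candidate_plans, execute_step=print):
    best_plan = max(candidate_plans, key=score_plan)
    for step in best_plan:
        execute_step(step)  # stand-in for actually acting on the step
    return best_plan

plans = [
    ["search docs", "draft answer", "verify with calculator"],
    ["draft answer"],
]
choose_and_run(plans)
```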
Let’s please try to keep the text civil when you send it. I get the passions we all have, and I completely understand the reaction to seeing counter-arguments. I ask that we check the text and edit our words accordingly, to remain civil and expansive rather than retractive. Please and thanks either way.
I was being sarcastic, calm down Phil. lol
Q* has learned to code in DNA. The most efficient and dense programming language is actually the one in every living creature. Q* can now be used to code and store data in organic materials. No machine or electricity required.
Do you know how “coding in DNA” is actually done?
I hope Q* LLM gets applied to ChatGPT sometime!