OpenAI ChatGPT + Robot = Figure 01



With OpenAI, Figure 01 can now have full conversations with people:

- OpenAI models provide high-level visual and language intelligence
- Figure neural networks deliver fast, low-level, dexterous robot actions

Everything in this video is a neural network:


Programmed to stutter

nice :thinking:

I think :laughing:


I noticed that! Very uncanny valley!

It was a very “natural” sounding voice… Just all the pieces coming together…


the renders look nice, what was used to generate the videos?


Which ones? Not sure…

Pause words, sneaky way to buy it a little more, um, what do you call it, uh, inference time :laughing:

And I wonder if that gravelly voice is a way to disguise a speech generation model optimized for speed over quality.


Heh… stream (Uh-huh) stream …

I’m hearing a lot of non-tech people saying the voice is faked…

Another take on YT…


This is pretty cool, but I don’t think the base AI is anything different from what has already been released as an API. The stutter/filler words allow for quicker responses by chunking the TTS-like model’s output. OpenAI’s current TTS endpoint returns only a complete audio file, which I assume is a limitation of the model; I’ve worked around this in the past by chunking the text at the most natural punctuation. The filler words may come from a very light LLM-like model that contextualizes them, or just from random filler-word selection (although that’s less clean).

The communication goes through a Whisper-like model → base LLM → TTS-like model chain. To control the movements, they are probably using an LLM-agent-like framework that can task-plan and access the different parts of the robot as tools. Pretty cool.
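To make the chunking idea concrete, here is a minimal sketch of splitting text at natural punctuation so each piece can be sent to a TTS endpoint as soon as it is ready, instead of waiting for the full response. The function name and the `max_len` threshold are my own illustration, not anything OpenAI or Figure has published:

```python
import re

def chunk_at_punctuation(text, max_len=80):
    """Split text into TTS-sized chunks at natural punctuation boundaries."""
    # Split after pausing punctuation, keeping the delimiter attached.
    pieces = re.split(r'(?<=[.,;:!?])\s+', text.strip())
    chunks, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) + 1 > max_len:
            chunks.append(current)
            current = piece
        else:
            current = f"{current} {piece}".strip()
    if current:
        chunks.append(current)
    return chunks

text = ("I see a red apple on a plate, a drying rack with cups, "
        "and you standing nearby with your hand on the table.")
for c in chunk_at_punctuation(text):
    print(c)  # each chunk can be handed to the TTS call as soon as it exists
```

Each chunk is synthesized and played while the next one is still being generated, which hides most of the end-to-end latency the thread is speculating about.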

It does look like we’re living in an age where first impressions pummel logic to death.


It sounds just like the voice from the ChatGPT App?

I’m questioning if the movements were pre-programmed at all or if the LLM genuinely instructed it on what to do…
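If the LLM genuinely drives the motions, one plausible mechanism is the tool/function-calling pattern mentioned earlier in the thread: expose each motion primitive as a tool the model can call. This sketch uses the OpenAI function-calling schema, but the tool names (`pick_up`, `hand_over`) and the dispatcher are hypothetical — Figure has not published its actual interface:

```python
# Hypothetical motion primitives described in the OpenAI tool schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "pick_up",
            "description": "Grasp a named object in the workspace.",
            "parameters": {
                "type": "object",
                "properties": {"object": {"type": "string"}},
                "required": ["object"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "hand_over",
            "description": "Pass the held object to a person.",
            "parameters": {
                "type": "object",
                "properties": {"object": {"type": "string"},
                               "to": {"type": "string"}},
                "required": ["object", "to"],
            },
        },
    },
]

def dispatch(call):
    """Route a model-emitted tool call to the motion subsystem (stubbed here)."""
    handlers = {
        "pick_up": lambda a: f"grasping {a['object']}",
        "hand_over": lambda a: f"handing {a['object']} to {a['to']}",
    }
    return handlers[call["name"]](call["arguments"])

# TOOLS would be passed with the chat request; the model's tool_calls
# would then be routed through dispatch() to the low-level controllers.
print(dispatch({"name": "pick_up", "arguments": {"object": "apple"}}))
```

Under this design the LLM only decides *which* primitive to run; the learned low-level policies still do the actual motion, which would explain how pre-trained behaviors and live instruction coexist.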

If I were to build a robot, it would not be completely GPT-driven.
I would have an automation controller working with GPT, so that the sensor systems and the automation for basic movements run in real time, while GPT instructs the automation for various tasks. The biggest issue with GPT is time delay: anything that moves needs to see in real time to react in real time. You can try to build complex math for predictive movements, and that can be tuned pretty well, but I am not convinced it is fast enough to ensure safety around the robot, which limits its applications. That is still OK if that is what you are aiming for.

Now, if that is a real product, it is impressive, and I look forward to AI speeds that can handle real time :slight_smile:

PLEASE do a demo video of asking it to do random things such as “lay that cup on its side” or “drop the plate mid-hold.” That would be incredible! I think we would all like to see it attempt unconventional tasks so that we may gain further insight into its true capabilities.

This is very cool. Can we NOT make them out of metal though? You know, just in case of a robot uprising. Maybe make them out of recycled plastic, it’ll keep costs down, and it will be good for the planet?

Here’s the image of a robot designed to be as safe and non-threatening as possible, made entirely out of bubble-wrap. It stands in a friendly pose, suggesting openness and helpfulness, with a welcoming expression on its face.


Perfect. It can also be easily shipped.


Welcome to the site, by the way.

Hope you stick around. We’ve got a great AI dev community growing here.


I would have expected that would be the one to demonstrate this level of capability first, given that OpenAI and Microsoft have put a substantial amount of money into that company.
There is a big fight for funding and talent, so it matters greatly who is able to publicly demonstrate proper progress.

The neural autonomous learning is amazing, and the dexterity feels like it’s crossed a developmental Rubicon. The question is what becomes of humans…?

Developmental Rubicon… interesting choice of words :sweat_smile:
I agree; to me, the training across many actuators working in concert is more impressive than the voice and vision integration.
As for humans, I won’t be so sad to see humans go; I don’t think our brains were designed for this kind of society. But what do you think?

I’ve had some experience with robotics and kinematics. A lot of wow factor here. One thing I noticed is the way the set was pre-staged. It might have been even more effective if we had seen the human demonstrator place the objects in front of the robot by sliding them over to it. This would dispel the notion that the movements were all pre-calculated.