So Muse - Embodied Personal AI that inspires

Hey there :v: I’m Louie and I just finished my master’s in product design in Berlin. In my graduation project I worked on Human/AI Interaction, focusing mainly on natural language.
So I built a prototype in the form of a personal robot that assists and guides you through certain steps of a creative process (on a conceptual level). It listens to your ideas, asks questions or even comes up with its own ideas. The main goal of the interaction is not to get perfect answers from the AI but to get answers that are so “out of the box” that they inspire.

For the robot I basically connected the OpenAI API with a speech recognition API and a speech synthesis API. The robot constantly listens for the wake word. Once it detects the wake word, a loop starts: it listens for an utterance and sends it to OpenAI, where the NLU and NLG happen. The reply text is then converted to an mp3 file of synthesized speech and played back. Then the loop starts again.
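
A minimal sketch of that loop in Python, assuming the wake word detection, speech recognition, OpenAI call and speech synthesis are each wrapped in a plain function. All four function names are hypothetical placeholders, not the actual code of the prototype, and ending a session when nothing is heard is also an assumption.

```python
from typing import Callable, Optional


def run_assistant(
    wait_for_wake_word: Callable[[], bool],
    listen: Callable[[], Optional[str]],
    generate_reply: Callable[[str], str],
    speak: Callable[[str], None],
) -> int:
    """Run wake-word-gated conversation turns and return how many were handled.

    All four callables are hypothetical wrappers:
      wait_for_wake_word: blocks until the wake word is heard (True),
                          or returns False to shut down.
      listen:             returns the recognized utterance, or None on silence.
      generate_reply:     sends the utterance to the language model.
      speak:              synthesizes the reply and plays it back.
    """
    turns = 0
    while wait_for_wake_word():
        # Conversation loop: keep taking turns until nothing is heard
        # (assumption: silence sends the robot back to wake word mode).
        while True:
            utterance = listen()
            if not utterance:
                break
            reply = generate_reply(utterance)
            speak(reply)
            turns += 1
    return turns
```

Passing the services in as functions keeps the loop testable without a microphone or an API key, and makes it easy to log what the speech recognition actually understood before it reaches the model.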

My main focus was on testing different use-cases with this kind of interface, so the basic chat function of GPT-3 was fine for this. I realized quickly that there is a lot of potential in different areas, but after a few minutes of conversation the AI often drifted in weird directions. It was still really interesting to look behind the curtain and see how the expectations of the human and the behaviour of the AI diverged. (Especially when the speech recognition understood something completely different and the human did not know.)

Now I would be really interested in picking a very narrow use-case and trying to fine-tune the model so that it can actually have conversations with specific knowledge of a field.
Even though I started with general brainstorming on conceptual ideas, some of the most interesting conversations I saw went in a medical direction, where the human couldn’t sleep or had a headache. In these cases the robot came up with some practical tips like holding one nostril closed and taking three deep breaths while focusing on your forehead. I don’t know if this is a proven method of treating headaches, but it was fascinating to see how the person followed the instructions and claimed to feel better afterwards.

A fine-tuned model that is, e.g., especially good at giving tips for stress reduction could be a useful application. For now I would like to leave aside the hardware part of my project and concentrate on exploring different use-cases with a GUI and maybe a VUI.

If you have any ideas or comments on what you would be excited to see, I would be very happy to start a discussion. I’m also really interested in what the recently announced fine-tuning options make possible and would be more than happy to get some technical feedback on whether this makes sense for my ideas at all :slight_smile:

Here is the project on my website and two links to videos of my robot in action:

Project Overview:

Product Video:

Testing Video:

Best, Louie


Hi Louie,

This is great, I had a very similar idea but with a more niche target audience. Physical device and all :smiley:

What engine are you using? Some answers seem a bit weak for davinci… but yeah, I faced the same problems when the speech-to-text service did not properly detect the words and sent wrong stuff to the AI.

It was davinci, yeah, but there were definitely deeper conversations. I just cut the video from snippets that fit my topic for the final presentation.
One thing that nearly everyone took for granted was that the AI would remember them and that they could continue a conversation any time, like with another human. It would be really interesting to implement that feature and build some sort of long-term memory. That’s what I’m trying to understand right now. I’ve seen some examples that use the model to summarize past conversations and feed this compressed data back in as memory. Do you have any experience with this?
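
The summarize-as-memory idea could look roughly like this in Python. This is only a sketch under stated assumptions: `summarize` is a hypothetical function that would call the model with a summarization prompt, and the class name, the `max_turns` cutoff and the prompt layout are made up for illustration.

```python
from typing import Callable, List


class SummaryMemory:
    """Keeps a rolling summary plus the most recent turns of a conversation."""

    def __init__(self, summarize: Callable[[str], str], max_turns: int = 4):
        self.summarize = summarize   # hypothetical model call
        self.max_turns = max_turns   # verbatim turns kept before compressing
        self.summary = ""
        self.recent: List[str] = []

    def add_turn(self, speaker: str, text: str) -> None:
        self.recent.append(f"{speaker}: {text}")
        if len(self.recent) > self.max_turns:
            # Fold the older turns into the summary, keep the newer half verbatim.
            keep = self.max_turns // 2
            cut = len(self.recent) - keep
            old, self.recent = self.recent[:cut], self.recent[cut:]
            to_compress = "\n".join([self.summary] + old).strip()
            self.summary = self.summarize(to_compress)

    def build_prompt(self, user_text: str) -> str:
        """Assemble summary, recent turns and the new utterance into one prompt."""
        parts = []
        if self.summary:
            parts.append("Summary of earlier conversation: " + self.summary)
        parts.extend(self.recent)
        parts.append("Human: " + user_text)
        parts.append("AI:")
        return "\n".join(parts)
```

Saving `summary` per user between sessions would be one way to let a conversation continue later, at the cost of losing whatever detail the summarizer drops.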

I know Replika does something like that; it keeps a memory of the user’s profile.

I can recommend @daveshapautomator’s book on Natural Language Cognitive Architecture. It’s a free download and it covers a lot of these topics. Might be a bit overkill but it’s good material: David K Shapiro - NLCA

I’m already halfway through the book :slight_smile: Will try to implement some ideas soon.
And I knew about Replika too, but will have a closer look. Thanks!