This idea is related to add a layer to the chatbot personality to convey an emotion to the user. This should be done in several scenario, such as reading a novel or assisting an angry guest in an hotel or…(I have many use cases) …or just for fun.
I’m going to build a “support skill” because I won’t be able to get through the certification process and then invite some users to try it based on their Amazon account (I never tried, I’ll have to see how it works). I think we are going to do it with GPT4. I got the API access!
BASIC GOAL: this BOT/Orace of Light cousin/(GPT+Alexa) thing shall be able to detect a sentiment in the User’s prompt and reply by detecting that sentiment and producing an emotion. To produce an emotion we are going to use Alexa’s SSML tags. Probably the result will be mediocre because of many reasons, but it is useful to build a prototype.
In the next years I think the LLM going to win will be also capable of superior sentiment analysis, being able to understand the user’s emotion and personality and being capable to manipulate the user in calming him/her down in stressful scenario or persuading him (doesn’t need to persuade her…) in buying things or other stuff. Also the company able to integrate this chatbot inside a real product (like Astro or Roomba or Hollywood Actress…) is going to be very successful, creating an emotional bind with its user/customer.
Asking chatGPT3.5 how many emotions is able to detect, it will reply:
<<As an AI language model, I don’t have emotions, but I can recognize and classify several human emotions based on text inputs, such as joy, sadness, anger, fear, surprise, and disgust. The number of emotions that can be detected varies depending on the specific algorithm or model used for emotion detection.>>
If I ask chatGPT if it knows Alexa SSML tags, it replies:
<<Yes, Alexa supports SSML (Speech Synthesis Markup Language) tags to convey emotions in speech output. Here are some examples of SSML tags for conveying emotions with Alexa:
Emphasis: The
<emphasis>
tag can be used to emphasize certain words or phrases, which can convey a sense of excitement or urgency. For example,
<speak> I am <emphasis level="strong">so</emphasis> excited to see you!</speak>
.
Prosody: The <prosody>
tag can be used to adjust the pitch, rate, and volume of speech to convey emotions such as excitement, anger, or sadness. For example, <speak> I'm really <prosody pitch="high">excited</prosody> to see you!</speak>
.
Break: The <break>
tag can be used to insert pauses in speech to convey emotions such as hesitation or uncertainty. For example, <speak> I don't know...<break strength="medium"/> maybe we should try something else?</speak>
.
Say-as: The <say-as>
tag can be used to change the way words are pronounced, which can convey emotions such as sarcasm or playfulness. For example, <speak> Oh, <say-as interpret-as="interjection">great</say-as>!
Now my phone is broken.</speak>
.
These are just a few examples of SSML tags that can be used to convey emotions with Alexa. There are many other tags and techniques that can be used as well.>>
So here I think it is better to restrict the domain to a subset.
ChatGPT says that As an AI language model, he doesn’t have emotions.
But it can have a personality, built inside the system variable. The one that you see in the playground at Playground - OpenAI API. Personality could be managed later as an embeddings.
You can help to identify this system variable by playing with the Playground (use the settings provided). Here is an example:
You can help me to develop this personality. Basically I will load and unload a few personality for the users that are going to help me. I will build another skill to test this personalities. You load the personality and then input some test vectors (that you have later to provide me) and you check the results and grade them (if you got the emotion). Then we select a couple of best personality and we will force a positive behavior, like mapping user anger to AI disgust or else. The goal is not tricking the AI or trying to achive something unconvinient. The goal is to analyze current SSML tag and see if we can propose an improvement. We always want to provide the chatBOT with a positive behavior and attitude, being able to calm down a user. If anybody has some advice, please let me know.
I’ll write more in the next days. Bye.