Extracting emotion or tone of speech

Is it possible to extract the emotion or tone of speech from a voice recording with the audio transcription models available on the API, namely whisper-1 and canary-whisper, by using the prompt parameter?

Currently it only does speech-to-text (STT), but I’d also like to extract the tone of the speech.


Interesting idea! That would be a great way to get more information out of a recording.


Yes, it would be really interesting to see if this is possible.

Though in my experiments so far, I have been unable to get this from the model(s).
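For anyone who wants to try the same kind of experiment, here is a minimal sketch, assuming the official openai Python package and its `audio.transcriptions.create` call. The prompt wording, the `TONE_PROMPT` constant, and the `transcribe_with_tone_prompt` helper are hypothetical names made up for illustration. Note that the prompt parameter is documented as a hint for style or for continuing a previous segment, not as an instruction, so there is no reason to expect emotion labels in the output.

```python
from pathlib import Path

# Hypothetical prompt text; asking for tone this way is an assumption,
# not a documented feature of the transcription endpoint.
TONE_PROMPT = "Transcribe the speech and annotate the speaker's emotion, e.g. [angry], [calm]."


def transcribe_with_tone_prompt(client, audio_path, model="whisper-1"):
    """Send audio to the transcription endpoint with a tone-seeking prompt.

    `client` is expected to be an `openai.OpenAI()` instance, or any object
    exposing the same `audio.transcriptions.create` method.
    """
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            prompt=TONE_PROMPT,
        )
    # With the default JSON response format, the transcript is in `.text`.
    return result.text


# Usage (needs the `openai` package and an API key in OPENAI_API_KEY):
#   from openai import OpenAI
#   print(transcribe_with_tone_prompt(OpenAI(), "recording.mp3"))
```

Comparing the returned text with and without the prompt on the same recording is a quick way to check whether the prompt changes anything beyond spelling and punctuation.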

Here, figure out if this is a scam to fleece VC investors

“just 20 seconds of speech - plus a quiz about your mood”