Is it possible to extract the emotion or tone of speech from a voice recording using the audio transcription models available on the API viz
Currently it only does STT, but I’d also like to extract the tone from the speech.
Interesting idea! That would be a great way to get more bandwidth from a recording.
Yes, it would be really interesting to see if this is possible.
Though in my experiments so far, I have been unable to get this from the model(s).
Here, figure out if this is a scam to fleece VC investors
“just 20 seconds of speech - plus a quiz about your mood”