Simple hack to make LLMs listen to audio rather than only reading it

Hey folks :waving_hand:

I recently wrote a blog post on how LLMs can "listen" to audio — not just read transcripts — by combining Whisper outputs with pitch, RMS, and other acoustic features.
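
The core idea can be sketched in a few lines: extract simple acoustic features from the waveform and inline them next to the transcript text so a text-only LLM can condition on them. This is a minimal, self-contained illustration (NumPy only, with a crude autocorrelation pitch estimate on a synthetic tone); a real pipeline would use something like `librosa.pyin` or CREPE for F0 and Whisper for the transcript — the function names and prompt format here are my own assumptions, not necessarily what the blog post uses.

```python
import numpy as np

def acoustic_summary(audio, sr):
    # RMS energy over the clip (a rough proxy for loudness/intensity)
    rms = float(np.sqrt(np.mean(audio ** 2)))
    # Crude pitch estimate via autocorrelation peak in the
    # plausible speech range (50-400 Hz); real pipelines would
    # use librosa.pyin or CREPE instead
    ac = np.correlate(audio, audio, mode="full")[len(audio) - 1:]
    lo, hi = sr // 400, sr // 50
    lag = lo + int(np.argmax(ac[lo:hi]))
    f0 = sr / lag
    return {"rms": rms, "f0_hz": f0}

def annotate_transcript(text, feats):
    # Inline the acoustic features so a text-only LLM can "hear" them
    return f'{text} [pitch~{feats["f0_hz"]:.0f} Hz, rms~{feats["rms"]:.3f}]'

# Synthetic half-second 200 Hz tone standing in for a speech segment
sr = 16000
t = np.arange(sr // 2) / sr
audio = 0.5 * np.sin(2 * np.pi * 200 * t)

feats = acoustic_summary(audio, sr)
print(annotate_transcript("I'm fine.", feats))
# -> I'm fine. [pitch~200 Hz, rms~0.354]
```

In practice you would run this per Whisper segment and feed the annotated lines to the LLM, so "I'm fine." said flatly and "I'm fine." said with raised pitch and energy produce different prompts.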

:link: Read it here: LLMs Meet Audio: Teaching AI to Hear Emotion, Not Just Read It

Would love to get feedback from others working with Whisper, embeddings, or sentiment from speech!