WhatsApp - Speak/Listen/Learn

sanjaymk · March 15, 2023, 7:13pm

The goal is to lower the barrier for non-English-speaking folks to use the OpenAI stack. You can text too, but speech is the new addition.

Demos: I have conversations in 5 languages (English, Hindi, Spanish, Portuguese, Indonesian), which together cover about 1 billion WhatsApp users: https://yella.co.in/call/html/index.html

Trial Link: https://wa.me/14087570747

Some initial observations of quirks in the current tech stack:

It's SLOW (~10 secs e2e): about 3 secs Whisper, 5 secs ChatGPT, 2 secs for the rest. Running on VERY low-end h/w currently.
Whisper has human-level error rates for high-resource languages (English, Spanish, French…).
With low-resource languages (even ones like Kannada, which the Whisper docs mention as having low error rates), Whisper often picks the wrong language (e.g. Tamil). This is a failure imo.
With high-resource languages, ChatGPT produces results in the spoken language.
With low-resource languages, ChatGPT leans towards English output. This is a failure imo because English is just not spoken/read in most low-resource language territories.
With low-resource languages, ChatGPT often produces utter garbage. So "low resource" doesn't just mean a lack of training data for Whisper. It also applies to GPT training data.
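
One cheap way to catch the wrong-language cases above (Whisper transcribing Kannada as Tamil, or ChatGPT replying in English) is to check which script the text is actually written in. Below is a minimal sketch of that idea, not what the bot currently does. The helper names `dominant_script` and `looks_like` are made up for illustration, and it assumes the expected script for each user is already known. It only tells scripts apart, so it cannot separate languages that share a script (e.g. Hindi and Marathi).

```python
import unicodedata
from collections import Counter
from typing import Optional


def dominant_script(text: str) -> Optional[str]:
    """Return the most common script among the letters in text, or None.

    Hypothetical helper: relies on Unicode character names starting with
    the script name, e.g. "KANNADA LETTER KA" or "LATIN SMALL LETTER A".
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation, combining signs
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def looks_like(text: str, expected_script: str) -> bool:
    """True if the dominant script of text matches expected_script."""
    return dominant_script(text) == expected_script


if __name__ == "__main__":
    # A Kannada speaker whose audio came back as Tamil text would fail this check.
    print(looks_like("வணக்கம்", "KANNADA"))
    # An English reply to a Kannada speaker would fail it too.
    print(looks_like("Hello, how can I help?", "KANNADA"))
```

If the check fails, one option is to retry the transcription with the language set explicitly, or to re-prompt ChatGPT asking for a reply in the user's language.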
