I’ve created a virtual assistant using the Realtime API. The assistant asks the user for an alphanumeric code and then asks whether it heard the code correctly.
The problem is that the model is unable to extract the code from the user audio (even though the Whisper transcription is correct).
Conversation example:
User: my code is RTA34FR
Whisper transcription: my code is RTA34FR
Assistant: Your code is RTEE4TC, is it correct?
I also tested in the Playground but I’m facing the same issue, so I assume it is not an audio quality problem.
I’ve had this issue as well with phone numbers and more complicated numbers or combinations of single letters and numbers.
This is a common problem across almost anything AI-related, and there is no real fix yet.
A good workaround is to handle the codes by calling the Whisper endpoint in addition to the realtime model (for example via tool use). This might take longer, but at least the codes will be accurate.
Hi @j.wischnat,
How do you achieve this? By passing the chat transcript back to GPT to revalidate the input text and return the actual user input, or in some other way?