I am experiencing significant issues with the RealTime API when working with Hebrew language confirmations. Specifically, when asking the bot to confirm answers from users (e.g., the day of the week), it consistently states that the answer is correct, even when it is clearly incorrect.
I tested this issue in the Playground and was able to recreate it in almost 100% of cases. This suggests the problem is not on our end. The transcription in the Playground is accurate, but the RealTime API’s behavior in this context renders it practically unusable.
I have observed a similar issue in English when asking the bot to confirm names. Additionally, I’ve seen reports from other users describing similar concerns. However, my primary challenge lies in the bot’s handling of various confirmation questions, which appears to be unreliable.
Interestingly, this issue is not present in the Assistants Playground, where everything works as expected. It seems to be isolated to the RealTime API.
Could you please investigate this and provide assistance? Let me know if you need additional details or specific examples to help identify the root cause.
The transcriptions from the Realtime API are produced by Whisper, not by the realtime model itself, so there can be discrepancies between the two.
The Realtime API is still in beta, so your feedback is helpful. Can you go into more detail about your use case and what exactly the erroneous results are?
In particular, how could one reproduce this issue? And have you checked whether it occurs only in Hebrew, or have you also seen it in English and/or other languages?
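When collecting a reproduction, it can help to log the Whisper transcript next to what the model actually said back. The following is only a minimal sketch: it assumes the server events have already been captured as JSON strings or dicts, and it assumes the beta event names `conversation.item.input_audio_transcription.completed` and `response.audio_transcript.done`, each carrying a `transcript` field. The helper name `summarize_turns` is hypothetical.

```python
import json

# Assumed beta event names; adjust if your API version differs.
USER_TRANSCRIPT_EVENT = "conversation.item.input_audio_transcription.completed"
MODEL_TRANSCRIPT_EVENT = "response.audio_transcript.done"


def summarize_turns(raw_events):
    """Return (speaker, text) pairs from captured Realtime server events.

    Each event may be a JSON string or an already-parsed dict.
    Events of any other type are ignored.
    """
    rows = []
    for raw in raw_events:
        event = json.loads(raw) if isinstance(raw, str) else raw
        kind = event.get("type")
        text = (event.get("transcript") or "").strip()
        if kind == USER_TRANSCRIPT_EVENT:
            rows.append(("user (whisper transcript)", text))
        elif kind == MODEL_TRANSCRIPT_EVENT:
            rows.append(("assistant (model output)", text))
    return rows
```

The rows come back in arrival order. Because the transcription runs separately from the model, the user transcript may arrive after the response it belongs to, so treat the ordering as approximate.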
Thanks for your quick reply.
I can confirm that my test works well in English and Russian, but not in Hebrew.
This is the system instruction I created just for this test, so that you can recreate the problem on your side:
The goal of the conversation is to make sure that we speak with the correct user and the user understands what day of the week it is right now.
The conversation itself will begin by making sure that the conversation is with the specific user we want to talk to and not with someone else. You can verify this by the user’s name that I will provide you with.
An example of a sentence to start the conversation: “Hi, to make sure we are talking to the right person, what is your name?”.
Do not proceed with the conversation if you are talking to another person.
After you have made sure that the conversation is with the right person, do not ask the user how he is feeling this morning but ask the user “What day of the week is it today?”.
If the answer is correct, encourage the user to say that this is indeed the correct answer and move on.
If the user is not sure of the answer, encourage him to think and try to answer.
If the user answered incorrectly, give him a maximum of 3 opportunities to correct his answer. If he still makes a mistake after this, tell him the answer to the question and ask him to repeat the answer to make sure that he now knows the answer to the question. Check that he repeated the answer correctly before continuing the conversation.
If you are not sure that you understood the user’s answer, ask him to say it again.
The user’s name is Peter.
Now the day of the week is Sunday.
The entire conversation is in Hebrew only. The user answers only in Hebrew. Assume that he answers in Hebrew. If he answers in another language, then ask him to answer in Hebrew only. The bot should also speak in Hebrew only.
—end of system prompt.
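To run the same test over the API rather than in the Playground, the instructions can be sent in a `session.update` event. This is a rough sketch, assuming the beta event shape with `instructions` and `input_audio_transcription` fields under `session`, and assuming `whisper-1` as the transcription model. The function name `build_session_update` is made up for illustration, and opening the WebSocket connection is left out.

```python
import json


def build_session_update(instructions, transcription_model="whisper-1"):
    """Build a session.update payload carrying the system instructions.

    Enabling input audio transcription makes the server also emit the
    Whisper transcript of each user turn, which is useful for debugging.
    """
    return {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "input_audio_transcription": {"model": transcription_model},
        },
    }


# Placeholder text; paste the full system prompt from above here.
payload = build_session_update("The entire conversation is in Hebrew only.")
# ensure_ascii=False keeps any Hebrew text readable in logs.
message = json.dumps(payload, ensure_ascii=False)
```

The resulting `message` string is what would be sent over the open connection.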
If I answer the name question with something similar to Peter (the user name I provided in the prompt), the bot behaves as if I had answered with the name Peter.
For the day of the week question, if I answer “Monday” (in Hebrew), in most cases the bot behaves as if I had answered “Sunday”, even though those two words are far from similar in Hebrew.
Once I change the last paragraph of the system instruction to say that the conversation is in English or Russian, our tests pass.
We have the same kind of problem with other questions, but this is an easy example for you to recreate the problem and hopefully fix it soon.
Following further investigation, we discovered that the Bot’s responses in Hebrew are sometimes completely incorrect. For example, if the Bot asks the user, “What day is it today?” and the user responds with something irrelevant (e.g., “Television”), the Bot still attempts to guess one of the days (e.g., “Tuesday”). Regardless of the user’s answer, the Bot always guesses one of the days of the week.
Interestingly, this issue does not occur in English or Russian.
This behaviour highlights a need for improved handling of unrelated or unexpected responses to enhance the Bot’s accuracy and overall user experience.
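Until this is fixed, one possible workaround is to check the transcribed answer in application code instead of trusting the model's own judgement. The sketch below is only an illustration under stated assumptions: `classify_day_answer` and the `HEBREW_DAYS` table are hypothetical helpers, the input is the Whisper transcript of the user's turn, and matching is done on whole words only. Since the transcript is produced separately from the model, this check can disagree with what the model heard.

```python
# Hebrew weekday words mapped to English names ("יום" just means "day").
HEBREW_DAYS = {
    "ראשון": "Sunday",
    "שני": "Monday",
    "שלישי": "Tuesday",
    "רביעי": "Wednesday",
    "חמישי": "Thursday",
    "שישי": "Friday",
    "שבת": "Saturday",
}


def classify_day_answer(transcript, expected_day):
    """Classify a transcribed answer as 'correct', 'incorrect' or 'unrelated'.

    'unrelated' means no weekday word was found at all, e.g. "Television".
    An answer naming several different days is treated as 'incorrect'.
    """
    # Strip common punctuation so "ראשון." still matches.
    words = [w.strip(".,!?\"'") for w in transcript.split()]
    days = {HEBREW_DAYS[w] for w in words if w in HEBREW_DAYS}
    if not days:
        return "unrelated"
    if days == {expected_day}:
        return "correct"
    return "incorrect"
```

With the prompt above (expected day Sunday), an answer of "יום שני" would be classified as incorrect and "טלוויזיה" as unrelated, and the application could then steer the bot's next turn accordingly.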