Hello,
I’m using the Realtime API with voice input/output. I have encountered an issue where I say, “My name is Maxim,” and the transcription correctly logs my name as “Maxim.” However, in the voice response, the assistant greets me as “Mark.”
This seems like a hallucination or mismatch between the transcription and TTS response.
Steps to reproduce:
- Connect via the Realtime API with voice enabled.
- Say: “My name is Maxim.”
- Check the logs: transcription is correct.
- Observe voice output: the model replies with “Hello Mark” or similar.
Expected behavior:
- The assistant should respond with the correct name extracted from the user’s input, e.g., “Hello Maxim.”
Is there any workaround for this? For example, setting the name explicitly in the instructions
or updating the session context?
Thanks in advance!