Hi everyone,
I’ve noticed an issue with the advanced voice mode that’s affecting the natural flow of conversations. While I understand the system needs to process speech in chunks, the current implementation is creating some usability challenges:
- Main Issue: There’s a timing mismatch between user input and system response that leads to frequent interruptions.
- Specific Behaviors:
- The system interrupts during natural speaking pauses
- When trying to continue speaking, there’s an awkward overlap where both the user and system cut each other off
- This creates a cycle of starting and stopping that makes it difficult to complete thoughts
- Impact:
- Conversations feel disjointed
- It’s challenging to maintain a natural speaking rhythm
- Users often have to restart their sentences multiple times
Has anyone else experienced this? I believe adjusting the pause threshold or implementing a more sophisticated turn-taking mechanism could help resolve this.
Any insights or workarounds would be greatly appreciated.