Semantic VAD: Request for Additional Eagerness Settings Below "Low"

I’ve been using semantic_vad with eagerness set to low in my voice assistant app, and while it’s a significant improvement over the default settings, I’m finding it still interrupts users a bit too frequently during natural pauses in speech.

This isn’t just my own impression: multiple users have raised the same concern during testing. They report feeling cut off mid-thought, particularly during:

  • Longer, more complex responses

  • Moments when they’re thinking/formulating their next point

  • Natural conversational pauses

Request: Would it be possible to add more granular control options below the current “low” setting? Something like:

  • very_low or patient mode

  • A numeric scale (e.g., 0-10) for finer control

  • Configurable timeout thresholds in milliseconds (while preserving semantic_vad and not switching to server_vad)
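
To make the three options above concrete, here is a rough sketch of how each might look in a session config. Everything marked "proposed" is hypothetical: `very_low`, `eagerness_level`, and `max_silence_ms` are names I made up for illustration and do not exist in the API today, and `ProposedPatience` and `proposedTurnDetection` are just helpers for this example.

```swift
import Foundation

// HYPOTHETICAL sketch of the requested options. None of the "proposed"
// values or keys below are part of the current API.
enum ProposedPatience {
    case veryLow                      // proposed named level below "low"
    case scale(Int)                   // proposed numeric scale, clamped to 0...10
    case timeout(milliseconds: Int)   // proposed silence threshold in ms
}

func proposedTurnDetection(_ patience: ProposedPatience) -> [String: Any] {
    // These two keys match the snippet I'm using today.
    var config: [String: Any] = [
        "type": "semantic_vad",
        "create_response": true
    ]
    switch patience {
    case .veryLow:
        config["eagerness"] = "very_low"                      // proposed value
    case .scale(let level):
        config["eagerness_level"] = min(max(level, 0), 10)    // proposed key
    case .timeout(let ms):
        config["eagerness"] = "low"                           // existing value
        config["max_silence_ms"] = ms                         // proposed key
    }
    return config
}
```

For example, `proposedTurnDetection(.timeout(milliseconds: 1500))` would keep semantic_vad with eagerness low and add the made-up `max_silence_ms` key on top, which is the shape I have in mind for the third option.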

I understand there’s a tradeoff between responsiveness and patience, but having more control would let developers tune the experience for their specific use case and user base.

Has anyone else experienced this? Would love to hear if others have found workarounds or have similar needs.

Thanks!

BTW: here’s the code snippet I’m using:

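    switch turnDetectionMode {
    case .semantic:
      // Semantic VAD at the most patient eagerness level currently available
      session["turn_detection"] = [
        "type": "semantic_vad",
        "create_response": true,
        "eagerness": "low"
      ] as [String: Any]  // mixed value types need an explicit annotation
    // ... (other cases omitted)
    }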