Better workaround:
- stream the response, so you receive tokens as they are generated,
- send each complete sentence to TTS as soon as it arrives, rather than waiting for the full response,
- buffer and assemble the audio stream, starting WebRTC playback only once enough audio is buffered that underruns are unlikely.
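
The steps above can be sketched roughly as follows. This is only an illustration, not a tested pipeline: `sentences_from_tokens`, `PlaybackGate`, the punctuation-based sentence heuristic, and the 0.5 s threshold are all assumptions made for the example, and the actual TTS and WebRTC calls are left out because they depend on your stack.

```python
import re
from typing import Iterable, Iterator

# Hypothetical heuristic: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each sentence as soon as the token stream completes it."""
    pending = ""
    for token in tokens:
        pending += token
        parts = _SENTENCE_END.split(pending)
        # Everything except the last part is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        pending = parts[-1]
    # Flush whatever is left when the stream ends.
    if pending.strip():
        yield pending.strip()


class PlaybackGate:
    """Tracks buffered audio and reports when playback may start."""

    def __init__(self, threshold_s: float = 0.5) -> None:
        self.threshold_s = threshold_s  # assumed value, tune for your network
        self.buffered_s = 0.0
        self.started = False

    def push(self, chunk_duration_s: float) -> bool:
        """Account for a new audio chunk; return True once playback can run."""
        self.buffered_s += chunk_duration_s
        if not self.started and self.buffered_s >= self.threshold_s:
            self.started = True
        return self.started
```

In use, you would loop over `sentences_from_tokens(token_stream)`, hand each sentence to your TTS engine, call `gate.push(duration)` for each returned audio chunk, and begin feeding the WebRTC track the first time `push` returns `True`. The right threshold depends on how fast your TTS runs relative to real time, so treat 0.5 s as a starting point to measure against.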