Hello, I am trying to set up a Chinese learning assistant with speech interaction.
The assistant needs to recognize speech in more than one language (my native language and Chinese) and to read its text responses aloud.
Is recognizing multilingual audio still a problem?
Is there a similar problem with speech synthesis (text-to-speech)?