Hi!
Does the Realtime API support responses in both audio and text? If yes, how do I implement it, and how do I ask the model to split a response between audio and text?
As an example, if the model message was:
"Several public licenses allow open-source distribution of software while imposing restrictions on its use. Here are some common ones that provide varying levels of control over how the software can be used, modified, and redistributed:
GNU General Public License (GPL)
Use Case: Ensures that software and its derivatives remain open-source.
Restriction: Any derivative work must also be distributed under the same license, meaning if someone modifies your software, they must release their modifications as open-source.
GNU Affero General Public License (AGPL)
Use Case: Specifically designed for networked software (e.g., web apps).
Restriction: Requires that any changes made to the software, even if it’s just used over a network (like in a cloud service), must also be shared as open-source. It prevents proprietary forks used in hosted environments without releasing source code."
–
This part of the message should be delivered as audio (with text):
“Several public licenses allow open-source distribution of software while imposing restrictions on its use. Here are some common ones that provide varying levels of control over how the software can be used, modified, and redistributed.”
The remaining part of the message (the details) should be text only. How can this be implemented? Can the model respond with two different message types, “audio” and “text”?
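To make the question concrete, this is roughly what I have in mind. It is an untested sketch, not something I know to be the right approach: I am assuming that a `response.create` client event accepts a per-response `modalities` list, so that I could request the overview with audio and then the details as text only. The helper name `build_split_requests` and the instruction strings are made up by me, and the field names may differ between API versions.

```python
import json
from typing import Any, Dict, List


def build_split_requests() -> List[Dict[str, Any]]:
    """Build two client events: an audio+text overview, then text-only details.

    Assumption: `response.create` takes a `response.modalities` list.
    The instruction strings are placeholders I invented for illustration.
    """
    overview = {
        "type": "response.create",
        "response": {
            # Assumed: audio output also produces a text transcript.
            "modalities": ["audio", "text"],
            "instructions": "Give only a short spoken overview of the answer.",
        },
    }
    details = {
        "type": "response.create",
        "response": {
            # Assumed: text-only output, so nothing is spoken for this part.
            "modalities": ["text"],
            "instructions": "Now list the details as text, without repeating the overview.",
        },
    }
    return [overview, details]


# Each event would be sent as one JSON message over the Realtime connection,
# the second one after the first response has finished.
for event in build_split_requests():
    print(json.dumps(event))
```

Is sending two separate requests like this the intended way, or can a single response mix both output types?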
Thanks for the help!!