Enabling Audio Access for GPT-4o via API

I’m exploring GPT-4o through the API and see options for text and vision, but none for voice. Will voice support be available soon? Specifically, will the API offer audio input and output as shown in the earlier demonstrations? These features are really exciting, and I appreciate how you addressed latency concerns!