Hi OpenAI team,
I’m currently working on a real-time voice assistant that helps users fill out forms and navigate support workflows. I’m using GPT-4o and need access to the speech-to-speech WebSocket API (wss://api.openai.com/v1/audio/speech-to-speech) for a seamless, full-duplex voice experience.
My use case involves:
- Real-time voice interaction
- Natural two-way conversation
- Instant TTS playback with interruption handling
- Live transcription
I’ve already integrated Whisper, GPT-4, and ElevenLabs in a chained setup, but I’m now hitting latency ceilings because each stage has to finish before the next one starts. A real-time speech-to-speech (S2S) API would remove those hand-offs.
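To illustrate the latency issue, here is a minimal sketch of a chained turn, assuming three strictly sequential stages (speech-to-text, language model, text-to-speech). The functions `transcribe`, `generate_reply`, and `synthesize` are hypothetical stand-ins, not real Whisper, GPT-4, or ElevenLabs calls; the point is only that the per-turn latency is the sum of the stage latencies.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the three stages of a chained pipeline.
# In a real setup each would be a network call with its own latency.
def transcribe(audio: bytes) -> str:
    return "transcript"

def generate_reply(text: str) -> str:
    return "reply to " + text

def synthesize(text: str) -> bytes:
    return text.encode("utf-8")

def timed(name: str, fn: Callable, arg, timings: List[Tuple[str, float]]):
    """Run one stage and record how long it took, in seconds."""
    start = time.perf_counter()
    result = fn(arg)
    timings.append((name, time.perf_counter() - start))
    return result

def run_chained_turn(audio: bytes) -> Tuple[bytes, List[Tuple[str, float]]]:
    """Run the stages one after another; each waits for the previous one."""
    timings: List[Tuple[str, float]] = []
    text = timed("stt", transcribe, audio, timings)
    reply = timed("llm", generate_reply, text, timings)
    speech = timed("tts", synthesize, reply, timings)
    return speech, timings

def total_latency(timings: List[Tuple[str, float]]) -> float:
    """Per-turn latency of a sequential chain is the sum of its stages."""
    return sum(duration for _, duration in timings)

if __name__ == "__main__":
    speech, timings = run_chained_turn(b"\x00\x01")
    for name, duration in timings:
        print(f"{name}: {duration * 1000:.3f} ms")
    print(f"total: {total_latency(timings) * 1000:.3f} ms")
```

With real services in place of the stubs, the same timing wrapper would show which stage dominates the delay before the user hears a response.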
Please let me know if my account can be enrolled in the beta or if there’s an application process.
Thanks so much!