Realtime API WebRTC best security & use practices

n0rdy · May 4, 2025, 6:23pm

Hello there!

I’m building a project that uses the OpenAI Realtime API via WebRTC, and it made me wonder about the recommended security practices.

Based on the official docs, the flow is the following:

UI → Server: request ephemeral key
Server → OpenAI: a request to https://api.openai.com/v1/realtime/sessions creates a session and returns the ephemeral key alongside session params such as the ID
Server → UI: returns the ephemeral key
UI → OpenAI: a request to https://api.openai.com/v1/realtime?model=${model} begins a session using the ephemeral key
UI → OpenAI via WebRTC: establishes a peer connection and proceeds with the conversation
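
The two HTTP requests in this flow (Server → OpenAI and UI → OpenAI) can be sketched roughly as below. This is a minimal TypeScript sketch, not an official client. The helper names and the HttpRequest type are hypothetical, and the Bearer headers and the application/sdp body are assumptions based on my reading of the Realtime WebRTC docs. Only the two URLs come from the flow itself.

```typescript
// Hypothetical request shape, used only for this illustration.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Server -> OpenAI: create a session with the long-lived API key.
// The response is expected to carry the ephemeral key (assumption: see lead-in).
function buildCreateSessionRequest(apiKey: string, model: string): HttpRequest {
  return {
    url: "https://api.openai.com/v1/realtime/sessions",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model }),
  };
}

// UI -> OpenAI: send the local SDP offer, authorized by the ephemeral key only.
// The long-lived API key never reaches the browser.
function buildSdpExchangeRequest(
  ephemeralKey: string,
  model: string,
  offerSdp: string
): HttpRequest {
  return {
    url: `https://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${ephemeralKey}`,
      "Content-Type": "application/sdp",
    },
    body: offerSdp,
  };
}
```

Each object could then be handed to fetch as fetch(req.url, req). Note that nothing in either request ties the ephemeral key to the UI app, which is what the concerns below are about.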

Once the conversation is over, the UI closes the peer connection, which ends the Realtime session.

There are a few things that worry me:

There is no way for the Server to stop the conversation by sending a request with the session ID to OpenAI. Such a call would be useful for restricting the duration of a conversation server-side (say, based on the user’s “credits”), since client-side restrictions are easy to bypass.
There is no way for the Server to get the session metadata (duration, costs, status, etc.) by session ID. That data would help detect misuse of conversations, which is easy to achieve by sniffing the ephemeral key returned by the Server and using it outside the UI app. Also, if a service charges its users based on conversation duration or tokens used, there is no way to learn the exact numbers for a particular session, which opens the door to abuse.

I understand that it is possible to fix that by using WebSockets on the server side, but that brings its own implementation complexity and additional network traffic costs: cloud providers charge for inbound/outbound traffic, and with audio the cost is quite noticeable.
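
To illustrate what server-side control would buy, here is a rough sketch of the duration cap a relay server could enforce. It is only a sketch under assumptions: the SessionBudget class, the credit model, and the SECONDS_PER_CREDIT value are all hypothetical and not part of any OpenAI API.

```typescript
// Hypothetical credit model: one credit buys a fixed number of seconds.
// The value is an assumption for illustration, not OpenAI pricing.
const SECONDS_PER_CREDIT = 60;

class SessionBudget {
  private readonly deadlineMs: number;

  // credits: what the user has left; startedAtMs: when the relay opened the session.
  constructor(credits: number, startedAtMs: number) {
    this.deadlineMs = startedAtMs + credits * SECONDS_PER_CREDIT * 1000;
  }

  // Milliseconds left before the server should close the upstream connection.
  remainingMs(nowMs: number): number {
    return Math.max(0, this.deadlineMs - nowMs);
  }

  // True once the paid-for time has been used up.
  isExhausted(nowMs: number): boolean {
    return nowMs >= this.deadlineMs;
  }
}
```

A relay would create one budget per session and schedule a close of its upstream socket after budget.remainingMs(Date.now()) milliseconds. With the direct WebRTC flow there is no server-side connection to close, so the same check can only run in the client.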

A question to the fellow developers and builders out there: how do you handle this?

Also, a question to the OpenAI team: are there any plans to extend the Session API with endpoints to do the following?

stop the session forcefully
get session status and metadata (e.g., costs & duration)

Thanks, and have fun.

Foxalabs · May 4, 2025, 8:00pm

It’s a great question and I will make sure it gets raised with OpenAI.

n0rdy · May 6, 2025, 2:57pm

Thanks a lot!

Please let me know once there is any info from OpenAI on that.

lukerohde · May 19, 2025, 9:27am

I am very keen for the same answer and am facing the same trouble.

nip10 · May 19, 2025, 8:27pm

+1, we need a way to control sessions server side

mueller4leon · June 17, 2025, 9:48am

+1. Not being able to get session costs by session ID (reliably and fraud-safe) when using WebRTC is sadly a blocker for our team.

Baur_Baur · June 19, 2025, 8:34pm

+1. I knew that this problem wasn’t just bothering me.

Emin2 · July 11, 2025, 5:26pm

+1, I agree. It makes productizing difficult for me.
