Do I have access to Realtime API?

I tried using the Realtime API for audio transcription, specifically with the gpt-4o-transcribe model. I’m getting responses that make me believe I may not have access to the Realtime API service. Does anyone know if the Realtime API is generally available, or is there a special request process to get access?

More details on exact behavior I’m seeing:
I am encountering a consistent 403 Forbidden error when attempting to establish a WebSocket connection to the Realtime API endpoint (wss://api.openai.com/v1/realtime/transcription_sessions). The error message received is: “The server returned status code ‘403’ when status code ‘101’ was expected.”

I have performed the following debugging steps:

  1. API Key Validation: My primary OpenAI API key successfully authenticates with other OpenAI endpoints (e.g., curl https://api.openai.com/v1/models), which confirms the key itself is valid.
  2. Session Creation via HTTPS POST: Strangely, I do get a proper response back, including an ephemeral client secret, but its expiry timestamp is already in the past. That seems like another way of telling me I don’t have access!?

Any guidance or clarification on this 403 error would be greatly appreciated.

Hi!

I guess the first check would be to go to the Playground and look at the list of models you have access to. Is Realtime in there?

Could you post a snippet of the setup and connection code you’re using?

Yes, Realtime is available for me in the Playground. I’m able to select gpt-4o-transcribe as the “user transcript model” and gpt-4o-realtime-preview as the “model”. I can engage in a conversational back-and-forth, although I can’t tell whether it is using gpt-4o-transcribe at all, which is the model of interest for me.

This is the code snippet (C#) where I create the WebSocket and attempt to connect:

        public async Task StartRealtimeSessionAsync(string prompt, string sessionId)
        {
            LogService.Instance.LogMedium($"[OpenAiModelProvider] Starting realtime session for {ModelId}.", sessionId);
            _realtimeCancellationTokenSource = new CancellationTokenSource();
            _realtimeCompletionSource = new TaskCompletionSource<string>();
            _realtimeTranscriptBuilder.Clear();
            _realtimeTranscriptParts.Clear();
            _lastItemId = string.Empty;

            try
            {
                _webSocket = new ClientWebSocket();
                _webSocket.Options.SetRequestHeader("Authorization", $"Bearer {OPENAI_API_KEY}");
                await _webSocket.ConnectAsync(new Uri(REALTIME_API_URL), _realtimeCancellationTokenSource.Token);

                // Send initial configuration
                var configMessage = new JObject(
                    new JProperty("type", "transcription_session.update"),
                    new JProperty("session", new JObject(
                        new JProperty("input_audio_format", "pcm16"), // AudioService resamples to 24kHz; OpenAI expects 24kHz 16-bit mono PCM for pcm16
                        new JProperty("modalities", new JArray("text")), 
                        new JProperty("input_audio_transcription", new JObject(
                            new JProperty("model", ModelId),
                            new JProperty("prompt", prompt),
                            new JProperty("language", "en") // Assuming English for now, can be made dynamic //todo
                        )),
                        new JProperty("input_audio_noise_reduction", new JObject(
                            new JProperty("type", "near_field") // Enable noise reduction
                        )),
                        new JProperty("turn_detection", (object?)null) // Set turn_detection to null
                    ))
                );
                await _webSocket.SendAsync(Encoding.UTF8.GetBytes(configMessage.ToString()), WebSocketMessageType.Text, true, _realtimeCancellationTokenSource.Token);

                // Start listening for messages in a background task
                _ = ReceiveMessagesAsync(sessionId, _realtimeCancellationTokenSource.Token);

                LogService.Instance.LogMedium($"[OpenAiModelProvider] Realtime session started for {ModelId}.", sessionId);
            }
            catch (Exception ex)
            {
                LogService.Instance.LogError($"[OpenAiModelProvider] Error starting realtime session for {ModelId}", ex, sessionId);
                _realtimeCompletionSource?.TrySetException(ex);
                DisposeRealtimeSession();
                throw;
            }
        }

Model ID is: gpt-4o-transcribe

I think your C# is wrong; try:

const string MODEL = "gpt-4o-transcribe";   // or use ?intent=transcription
var ws = new ClientWebSocket();

// Required sub-protocols (ordered)
ws.Options.AddSubProtocol("realtime");
ws.Options.AddSubProtocol($"openai-insecure-api-key.{OPENAI_API_KEY}");
ws.Options.AddSubProtocol("openai-beta.realtime-v1");

// OR: keep your Bearer header instead of the “insecure” sub-protocol
// ws.Options.SetRequestHeader("Authorization", $"Bearer {OPENAI_API_KEY}");
// ws.Options.SetRequestHeader("OpenAI-Beta", "realtime=v1");

// Correct URI
var uri = new Uri($"wss://api.openai.com/v1/realtime?model={MODEL}");
await ws.ConnectAsync(uri, CancellationToken.None);

You’re connecting to wss://api.openai.com/v1/realtime/**transcription_sessions**.

The Realtime API expects you to connect to wss://api.openai.com/v1/realtime and pass either
?model=gpt-4o-transcribe or
?intent=transcription
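
In other words, only the base path plus a query parameter selects the transcription flow. A quick sketch of the two accepted URL forms (Python; the helper function is just for illustration, not part of any SDK):

```python
from urllib.parse import urlencode

REALTIME_BASE = "wss://api.openai.com/v1/realtime"

def realtime_url(**params: str) -> str:
    """Append query parameters (e.g. model=... or intent=...) to the base URL."""
    return f"{REALTIME_BASE}?{urlencode(params)}" if params else REALTIME_BASE

# Either form selects a transcription session:
url_by_model = realtime_url(model="gpt-4o-transcribe")   # ...?model=gpt-4o-transcribe
url_by_intent = realtime_url(intent="transcription")     # ...?intent=transcription
```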

I would try with Python and the OpenAI library to get it up and running first. Then break it down into C# if you have to.
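
As a starting point for that, here is a hedged Python sketch of the same transcription_session.update message the C# code sends. Field names are copied from the snippet above, not verified against the full API schema:

```python
import json

def build_session_update(model: str, prompt: str, language: str = "en") -> str:
    """Mirror of the C# JObject: a transcription_session.update event as JSON text."""
    event = {
        "type": "transcription_session.update",
        "session": {
            "input_audio_format": "pcm16",  # 24kHz, 16-bit mono little-endian
            "input_audio_transcription": {
                "model": model,
                "prompt": prompt,
                "language": language,
            },
            "input_audio_noise_reduction": {"type": "near_field"},
            "turn_detection": None,  # disable server-side turn detection
        },
    }
    return json.dumps(event)

payload = build_session_update("gpt-4o-transcribe", "Transcribe the meeting audio.")
```

Send that as a text frame right after the WebSocket connects, the same way the C# SendAsync call does.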


Thank you for the help. I am able to achieve a successful connection in Python now. I will port over the changes to C#.


The problem was indeed the URL and the query parameters. After fixing them, I was able to connect successfully. In case it’s helpful for others, here’s the updated code snippet that works:

private const string REALTIME_API_URL = "wss://api.openai.com/v1/realtime?intent=transcription";

        public async Task StartRealtimeSessionAsync(string prompt, string sessionId)
        {
            LogService.Instance.LogMedium($"[OpenAiModelProvider] Starting realtime session for {ModelId}.", sessionId);
            _realtimeCancellationTokenSource = new CancellationTokenSource();
            _realtimeCompletionSource = new TaskCompletionSource<string>();
            _realtimeTranscriptBuilder.Clear();
            _realtimeTranscriptParts.Clear();
            _lastItemId = string.Empty;

            try
            {
                _webSocket = new ClientWebSocket();
                _webSocket.Options.SetRequestHeader("Authorization", $"Bearer {OPENAI_TEMP_CLIENT_SECRET}");
                _webSocket.Options.SetRequestHeader("OpenAI-Beta", "realtime=v1");
                await _webSocket.ConnectAsync(new Uri(REALTIME_API_URL), _realtimeCancellationTokenSource.Token);
                // Send initial configuration
                var configMessage = new JObject(
                    new JProperty("type", "transcription_session.update"),
                    new JProperty("session", new JObject(
                        new JProperty("input_audio_format", "pcm16"), // AudioService resamples to 24kHz; OpenAI expects 24kHz 16-bit mono PCM for pcm16
                        // new JProperty("modalities", new JArray("text")), 
                        new JProperty("input_audio_transcription", new JObject(
                            new JProperty("model", ModelId),
                            new JProperty("prompt", prompt),
                            new JProperty("language", "en") // Assuming English for now, can be made dynamic //todo
                        )),
                        // new JProperty("turn_detection", new JObject(
                        //     new JProperty("type", "semantic_vad")
                        // ))
                        new JProperty("turn_detection", (object)null)
                    ))
                );
                await _webSocket.SendAsync(Encoding.UTF8.GetBytes(configMessage.ToString()), WebSocketMessageType.Text, true, _realtimeCancellationTokenSource.Token);

                // Start listening for messages in a background task
                _ = ReceiveMessagesAsync(sessionId, _realtimeCancellationTokenSource.Token);

                LogService.Instance.LogMedium($"[OpenAiModelProvider] Realtime session started for {ModelId}.", sessionId);
            }
            catch (Exception ex)
            {
                LogService.Instance.LogError($"[OpenAiModelProvider] Error starting realtime session for {ModelId}", ex, sessionId);
                _realtimeCompletionSource?.TrySetException(ex);
                DisposeRealtimeSession();
                throw;
            }
        }
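
One note on the OPENAI_TEMP_CLIENT_SECRET used in the Authorization header above: it comes from the HTTPS POST session-creation step mentioned earlier in the thread (POST /v1/realtime/transcription_sessions). A hedged Python sketch of assembling that request — endpoint and header names as I understand them, the body fields are not verified against current docs:

```python
import json

API_BASE = "https://api.openai.com/v1"

def build_session_request(api_key: str) -> tuple[str, dict, str]:
    """Assemble URL, headers, and body for creating a transcription session.

    The response should contain a short-lived client secret, which then goes
    into the WebSocket's "Authorization: Bearer ..." header.
    """
    url = f"{API_BASE}/realtime/transcription_sessions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # standard API key, not the ephemeral one
        "Content-Type": "application/json",
    }
    body = json.dumps({"input_audio_format": "pcm16"})
    return url, headers, body
```

Pass those three to any HTTP client (e.g. requests.post(url, headers=headers, data=body)); the ephemeral secret in the response is what the WebSocket connection then uses.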

Thank you, @Foxalabs
