I’m trying to transcribe audio using a WebSocket connection. The transcription session is successfully created, but I am not receiving the transcription text. Could you please guide me in resolving this issue?
this.ws = new WebSocket(`wss://api.openai.com/v1/realtime?intent=transcription`, [
  "realtime",
  `openai-insecure-api-key.${token}`,
  "openai-beta.realtime-v1"
]);

this.ws.onopen = () => {
  console.log('Connected to OpenAI realtime API');
  // Send configuration once connected
};

this.ws.onmessage = (event: MessageEvent) => {
  console.log(event);
};

audioWorkletNode.port.onmessage = (event) => {
  if (!this.ws || this.ws.readyState !== WebSocket.OPEN) return;
  const inputData = event.data.audio_data;
  // console.log(inputData)
  if (!inputData || inputData.length === 0) {
    console.warn('Received empty audio data');
    return;
  }
  const currentBuffer = new Int16Array(event.data.audio_data);
  // Use type assertion to assure TypeScript this is compatible
  audioBufferQueue = this.mergeBuffers(
    audioBufferQueue,
    currentBuffer
  );
  const bufferDuration =
    (audioBufferQueue.length / this.transcriptionContext.sampleRate) * 1000;
  // wait until we have 100ms of audio data
  if (bufferDuration >= 100) {
    const totalSamples = Math.floor(this.transcriptionContext.sampleRate * 0.1);
    // Extract the portion we want to send
    const dataToSend = audioBufferQueue.subarray(0, totalSamples);
    // Encode the Int16Array to base64
    const base64Audio = this.encodeInt16ArrayToBase64(dataToSend);
    // Update our queue to remove the sent data
    audioBufferQueue = audioBufferQueue.subarray(totalSamples);
    // Convert to the format OpenAI expects (16-bit PCM)
    // const audioBuffer = this.floatTo16BitPCM(finalBuffer);
    // const base64Audio = this.arrayBufferToBase64(audioBuffer);
    // Send the audio data to OpenAI
    this.ws.send(JSON.stringify({
      type: 'input_audio_buffer.append',
      audio: base64Audio
    }));
    // this.ws.send(JSON.stringify({
    //   type: 'response.create',
    // }));
  }
};
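The handler above calls mergeBuffers and encodeInt16ArrayToBase64, which are not shown in the post. The following is only a sketch of what such helpers might look like, written here as free functions; the bodies are assumptions, not the original implementations. One detail worth checking in the real helpers: subarray() returns a view over the same underlying buffer, so the encoder has to respect byteOffset and byteLength, otherwise it would encode the whole queue instead of the 100 ms slice.

```typescript
// Hypothetical stand-ins for the helpers referenced in the post.

// Concatenate two Int16Array chunks into a new array.
function mergeBuffers(a: Int16Array, b: Int16Array): Int16Array {
  const merged = new Int16Array(a.length + b.length);
  merged.set(a, 0);
  merged.set(b, a.length);
  return merged;
}

// Encode 16-bit PCM samples as base64.
// Uses byteOffset/byteLength so that views created with subarray()
// encode only their own samples, not the whole backing buffer.
// Byte order follows the platform (little-endian on common hardware).
function encodeInt16ArrayToBase64(samples: Int16Array): string {
  const bytes = new Uint8Array(
    samples.buffer,
    samples.byteOffset,
    samples.byteLength
  );
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```

If the real encoder is built on `new Uint8Array(samples.buffer)` without the offset arguments, every append message would carry more audio than intended.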
I have also attached a screenshot of the log. I was not able to update the session to use the gpt-4o-mini-transcribe model. I want to use this feature on a production site. Could you please help me resolve this issue?
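For context, the onopen handler in the code never sends a session configuration, and onmessage logs the whole MessageEvent rather than the JSON payload in event.data. The sketch below shows one way the session update and the transcript handling might look. The event names (transcription_session.update, conversation.item.input_audio_transcription.delta and .completed) and the session fields reflect the beta Realtime transcription documentation as I understand it and should be checked against the current API reference. The function names are hypothetical. Note also that the pcm16 input format is documented as 24 kHz mono, so audio captured at 44.1 or 48 kHz would likely need resampling first.

```typescript
// Hypothetical helpers; event names and fields are assumptions taken from
// the beta Realtime transcription docs and may have changed.

interface TranscriptUpdate {
  kind: 'delta' | 'completed' | 'error';
  text: string;
}

// Build the message that selects the transcription model for the session.
function buildTranscriptionSessionUpdate(model: string): string {
  return JSON.stringify({
    type: 'transcription_session.update',
    session: {
      input_audio_format: 'pcm16',
      input_audio_transcription: { model },
      // Server-side voice activity detection commits the buffer automatically.
      turn_detection: { type: 'server_vad' },
    },
  });
}

// Pull transcript text (or an error message) out of a raw server message.
// Returns null for event types this sketch does not handle.
function readTranscript(raw: string): TranscriptUpdate | null {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case 'conversation.item.input_audio_transcription.delta':
      return { kind: 'delta', text: msg.delta ?? '' };
    case 'conversation.item.input_audio_transcription.completed':
      return { kind: 'completed', text: msg.transcript ?? '' };
    case 'error':
      return { kind: 'error', text: msg.error?.message ?? 'unknown error' };
    default:
      return null;
  }
}
```

In the handlers from the post this would be used roughly as `this.ws.send(buildTranscriptionSessionUpdate('gpt-4o-mini-transcribe'))` inside onopen and `const update = readTranscript(event.data)` inside onmessage. Logging the error case is useful, since a rejected session update would otherwise go unnoticed.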