Realtime API audio input tokens usage adding up every question

pedroavex · January 22, 2025, 6:53am

Hi there!
I am building a proof-of-concept voice assistant using WebRTC and the model 'gpt-4o-mini-realtime-preview-2024-12-17'. The app works fine, but because the context window is cumulative by default, the API cost increases very quickly: with each new question in the same session, the input audio tokens keep adding up. I tried passing an empty input array ("input": []) in my 'response.create' object, but it didn't work. Something like this:

dataChannel.current.send(JSON.stringify({
  type: "response.create",
  response: {
    input: [], // Intended to drop all previous context
    modalities: ["audio", "text"]
  }
}));

Does anyone have an idea that could help?
Thanks!

j.wischnat · January 22, 2025, 3:37pm

You can create a new session each time you ask a question - of course the AI won’t remember anything of the previous session. There is no workaround. You will always have to pass it the context it needs in order for it to know what you are talking about.

pedroavex · January 22, 2025, 6:29pm

Hi! Well, I did something different and apparently it worked: as soon as I receive a 'response.done' for the assistant's response, I take its item ID and send an event to delete that item. It may be less aggressive than restarting the session, but I will think about it. Thanks!
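
For anyone who wants to try the same thing, here is a rough sketch of the approach described above. It assumes that the 'response.done' event carries the assistant's output items under response.output, and it uses the 'conversation.item.delete' client event with an item_id field. The helper names buildDeleteEvents and attachPruning are made up for this example, so treat it as a starting point rather than a reference implementation.

```javascript
// Hypothetical helper: turn a parsed server event into the delete events to send.
// Returns an empty array for anything other than "response.done".
function buildDeleteEvents(serverEvent) {
  if (!serverEvent || serverEvent.type !== "response.done") return [];
  const output = (serverEvent.response && serverEvent.response.output) || [];
  return output
    .filter((item) => item && item.id) // skip items without an ID
    .map((item) => ({ type: "conversation.item.delete", item_id: item.id }));
}

// Hypothetical wiring: listen on the data channel and delete each finished
// assistant item so it is not counted as input on the next turn.
function attachPruning(channel) {
  channel.addEventListener("message", (e) => {
    for (const evt of buildDeleteEvents(JSON.parse(e.data))) {
      channel.send(JSON.stringify(evt));
    }
  });
}
```

With the setup from my first post this would be called once as attachPruning(dataChannel.current). Note that it only removes the assistant's items. The user's audio items stay in the conversation unless they are deleted the same way, and the model will no longer remember the deleted turns.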
