Realtime API: audio input token usage adds up with every question

Hi there!
I am building a proof-of-concept voice assistant using WebRTC and the model ‘gpt-4o-mini-realtime-preview-2024-12-17’. The app works fine, but because the context window is cumulative by default, the API cost increases very quickly: for each new question in the same session, the input audio tokens keep adding up. I tried passing an empty input array (“input”: []) in my ‘response.create’ object, but it didn’t work. Something like this:

dataChannel.current.send(JSON.stringify({
  type: "response.create",
  response: {
    input: [], // Intended to drop all previous context for this response
    modalities: ["audio", "text"]
  }
}));

Does anyone have an idea that could help?
Thanks!


You can create a new session each time you ask a question - of course the AI won’t remember anything of the previous session. There is no workaround. You will always have to pass it the context it needs in order for it to know what you are talking about.

Hi! Well, I did something different and apparently it worked: as soon as I receive a ‘response.done’ event for an assistant response, I take the ID of the assistant item and send an event to delete it. It may be less aggressive than restarting the session, but I will think about it. Thanks!
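
For anyone landing here, a rough sketch of that delete-on-done approach might look like the following. It assumes the ‘response.done’ server event carries the finished items under `response.output` and that items are removed with a ‘conversation.item.delete’ client event taking an `item_id`. The helper names `buildDeleteEvents` and `attachPruning` are made up for illustration, and I have not measured how much this actually saves.

```javascript
// Sketch only: turn a "response.done" server event into one
// "conversation.item.delete" client event per output item.
// Any other event type yields an empty list.
function buildDeleteEvents(serverEvent) {
  if (!serverEvent || serverEvent.type !== "response.done") return [];
  const output = (serverEvent.response && serverEvent.response.output) || [];
  return output
    .filter((item) => item && typeof item.id === "string")
    .map((item) => ({ type: "conversation.item.delete", item_id: item.id }));
}

// Hypothetical wiring: `dataChannel` is the RTCDataChannel from the WebRTC
// setup (dataChannel.current in the React code above).
function attachPruning(dataChannel) {
  dataChannel.addEventListener("message", (e) => {
    const serverEvent = JSON.parse(e.data);
    for (const deleteEvent of buildDeleteEvents(serverEvent)) {
      dataChannel.send(JSON.stringify(deleteEvent));
    }
  });
}
```

Call `attachPruning(dataChannel.current)` once the channel is open. Note that this only removes the assistant's items. The user's own audio items stay in the conversation unless they are deleted the same way, and their IDs would have to be collected from other server events (I believe ‘conversation.item.created’ reports them, but treat that as an assumption).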