gpt-realtime-1.5: text output mode broken when tools are enabled

I’ve been using gpt-realtime-1.5 for a couple of days now and ran into an interesting issue. With output_modalities=["audio"], the model works great. But when I switch to output_modalities=["text"] with tools enabled and rely on an external TTS, performance drops significantly compared to gpt-realtime.

Issues I’m seeing in text-only mode:

  • Model wraps normal conversational responses in curly braces {} as if it’s outputting JSON
  • Function call arguments leak into the text output channel (the TTS literally tries to speak the function call JSON)
  • Internal control tokens leak into the output, e.g.: <|aesthetics_3|><|has_watermark|>
  • Ignores language instructions that gpt-realtime followed perfectly

None of these issues exist with gpt-realtime in the same configuration, or with gpt-realtime-1.5 in audio output mode. Seems specific to text mode + tools.
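For anyone trying to reproduce, here is a minimal sketch of the configuration I mean (field names follow the session.update shape of the Realtime API as I understand it; the get_weather tool is just a placeholder, not the tool I actually use):

```python
import json

# Sketch of the session.update payload for the failing setup:
# text-only output + tools enabled, audio handled by an external TTS.
session_update = {
    "type": "session.update",
    "session": {
        "type": "realtime",
        "output_modalities": ["text"],  # audio mode works; text mode misbehaves
        "tools": [
            {
                "type": "function",
                "name": "get_weather",  # placeholder tool for illustration
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ],
    },
}

# This JSON string is what gets sent over the WebSocket connection.
payload = json.dumps(session_update)
```

Switching the one line to output_modalities=["audio"], with everything else identical, makes the problems disappear for me.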


I would like to second that there is something very, very wrong with output_modalities=["text"] on the new model. Almost every response it gives is somehow wrong, or is a tool call at the incorrect time. After an incorrect tool call or response, it follows up with an “oops, I messed that up, let’s try again” and tries to continue.


Yes, this is happening to me too, and for some reason the first turn/message is always stuck until a second message comes in.


Hi and welcome to the community!

I can also reproduce several of the behaviors you described:

  • In text-only mode, the model does return JSON-like content (for example, normal replies wrapped in { ... }) instead of a natural conversational answer.
  • I also see tool-related JSON leaking into the user-facing text output in this setup, which would cause an external TTS that reads the text stream to literally speak JSON.
  • In the same configuration, I see weaker adherence to instructions compared to audio output mode.

Will ping the team to take a look!

PS: I did not capture the “internal control tokens” leak (<|aesthetics_3|><|has_watermark|>) in my tests. If anyone can share request IDs, that would be helpful.


Thanks for reproducing and escalating!

Unfortunately I didn’t capture the specific request IDs for the control token leak at the time. I’ll start logging them and share as soon as I can reproduce it again.


I can also confirm all these bugs happen in almost any conversation, regardless of the instructions the model gets. For me, as of now, gpt-realtime is superior.

  • Model wraps normal conversational responses in curly braces {} as if it’s outputting JSON
  • Function call arguments leak into the text output channel (the TTS literally tries to speak the function call JSON)
  • Internal control tokens leak into the output, e.g.: <|aesthetics_3|><|has_watermark|>
  • Ignores language instructions that gpt-realtime followed perfectly


Hey everyone, can someone please provide a request ID so we can look at our backend and have this reviewed by our engineering team?


I just saw this on session sess_DDe8ALUUKkXHDI1OPQcmC. I don’t think I have a request ID, but that session is an example where it returns weird control tokens in the output, and not much else happens.


Hi Parashant, I’m seeing the same issue and don’t have a request ID either, but I do have a session ID: sess_DGDRIlsIi4vdwRDiDmu2c

Would be much obliged if you had any feedback from the engineering team. Very eager to switch to 1.5 in production, but this is a blocker.

Thanks!

I’m seeing the exact same thing with out-of-band transcriptions (i.e. text output).

You can reproduce it by following the official realtime_out_of_band_transcription cookbook example and upgrading the model from gpt-realtime to gpt-realtime-1.5.
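For context, the out-of-band request from that cookbook pattern looks roughly like this (a sketch, not the exact cookbook code; field names assumed from the Realtime API, and the instructions string is a placeholder):

```python
import json

# Sketch of an out-of-band response.create request: the response is generated
# outside the main conversation ("conversation": "none") with text-only output,
# which is exactly the mode where gpt-realtime-1.5 misbehaves.
oob_request = {
    "type": "response.create",
    "response": {
        "conversation": "none",         # out-of-band: don't write to the session
        "output_modalities": ["text"],  # text-only output channel
        "instructions": "Transcribe the user's last audio turn verbatim.",
    },
}

# Sent over the same WebSocket connection as the main session events.
payload = json.dumps(oob_request)
```

With gpt-realtime this request returns a clean transcription; with gpt-realtime-1.5 the same request shows the JSON-wrapping and token-leak behavior described above.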