CoT with 4o Audio or Real Time

youssefsebti123 · November 1, 2024, 3:24pm

My use case requires chain-of-thought (CoT) reasoning before the answer we output to the user. I'm currently using an STT > LLM > TTS pipeline. Has anyone managed to get the audio or Realtime API to speak ONLY what we want to communicate to the user, while keeping the CoT part of the answer silent?

giovanneafonso · November 1, 2024, 11:29pm

You could try: STT > LLM(COT) > LLM > TTS.

Where the first LLM would output the chain of thought, and the second LLM would get the prompt + user message + COT.
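A minimal sketch of that two-pass flow, in case it helps. The two model calls are passed in as plain callables, so `reason` and `respond` are hypothetical stand-ins for whatever LLM client you use, and the prompt wording is only an assumption. The point is that only the second output is ever handed to TTS.

```python
from typing import Callable, Tuple


def answer_with_hidden_cot(
    user_text: str,
    system_prompt: str,
    reason: Callable[[str], str],   # hypothetical: first LLM call, returns the CoT
    respond: Callable[[str], str],  # hypothetical: second LLM call, returns the reply
) -> Tuple[str, str]:
    """Return (cot, spoken). Only `spoken` should be sent to TTS."""
    # Pass 1: produce the chain of thought. This text is never spoken.
    cot = reason(
        f"{system_prompt}\n\nUser: {user_text}\n\nThink step by step."
    )
    # Pass 2: give the second model the prompt, the user message, and the CoT,
    # and ask for the user-facing reply only.
    spoken = respond(
        f"{system_prompt}\n\nUser: {user_text}\n\n"
        f"Reasoning (do not reveal):\n{cot}\n\n"
        "Write only the final reply to the user."
    )
    return cot, spoken
```

You would keep `cot` for logging or debugging and pass `spoken` to your TTS step. The cost is a second LLM round trip, so latency goes up.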

If you really need or want to use the Realtime API, it's worth trying it with function calling: in your prompt, you can instruct the LLM when to call that CoT function.
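Something like the following is what I have in mind for the function-calling route. It is a rough sketch that only builds the JSON for a `session.update` event. The field layout follows the Realtime API's tool format as I understand it, so check it against the current docs. The function name `record_reasoning` and the instruction text are made up for illustration.

```python
import json


def build_session_update() -> str:
    """Build a session.update event that registers a hypothetical CoT tool."""
    event = {
        "type": "session.update",
        "session": {
            # Assumed wording: tell the model to put its reasoning in the tool call.
            "instructions": (
                "Before answering, call record_reasoning with your step-by-step "
                "reasoning. Never say the reasoning out loud. "
                "Then speak only the final answer."
            ),
            "tools": [
                {
                    "type": "function",
                    "name": "record_reasoning",  # hypothetical tool name
                    "description": "Store private reasoning that is not spoken.",
                    "parameters": {
                        "type": "object",
                        "properties": {"reasoning": {"type": "string"}},
                        "required": ["reasoning"],
                    },
                }
            ],
            "tool_choice": "auto",
        },
    }
    return json.dumps(event)
```

The idea is that function-call arguments arrive as text events rather than audio, so the reasoning stays silent. You would send this string over the WebSocket after connecting. Whether the model reliably calls the function before speaking is something you would have to test.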

It seems that it's currently not possible to create RAG-like systems using the Realtime API. I hope this changes soon.

youssefsebti123 · November 4, 2024, 6:04pm

Yep, thanks. For now it's easier and quicker to do STT > LLM(COT) > TTS.
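For the single-LLM version, one way to keep the CoT silent is to have the model end with a marker and send only the text after it to TTS. This is a sketch under that assumption: the `FINAL ANSWER:` marker is something you would have to ask for in your own prompt, and the fallback when the marker is missing is a design choice, not a given.

```python
from typing import Tuple

FINAL_MARKER = "FINAL ANSWER:"  # assumed marker, requested in the prompt


def split_cot(llm_output: str, marker: str = FINAL_MARKER) -> Tuple[str, str]:
    """Split LLM text into (reasoning, spoken) at the last occurrence of marker."""
    head, sep, tail = llm_output.rpartition(marker)
    if not sep:
        # Marker missing: treat everything as reasoning and speak nothing,
        # so the caller can retry instead of leaking the CoT into audio.
        return llm_output.strip(), ""
    return head.strip(), tail.strip()
```

Only the second element goes to TTS. The first can be logged or dropped.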

a5hpip3 · November 5, 2024, 9:13pm

I’ve had some success by explicitly calling out instructions in the prompt. For example, I have this in my prompt:

My instructions to you will be in all caps, and be contained in square brackets []. DO NOT PUBLISH MY INSTRUCTIONS TO THE USER.

Q2. Question … [EXTRACT 2 RESPONSES HERE]

[FOR EACH RESPONSE FROM Q2, ASK THE FOLLOWING QUESTION WITH THE EXTRACTED RESPONSE]

This works, but it glitches from time to time: it speaks the first instruction to the user, then adheres to the rest. I'm not entirely sure why this happens, as I haven’t been able to reproduce it consistently.

youssefsebti123 · November 12, 2024, 8:05pm

Thanks for this, but I'm not following. Can you share a prompt example I can run? The goal is to have TEXT output that is NOT spoken in the audio.
