I don’t understand the relationship between streamed message output events and the final_output of the run result. I would expect final_output to be the aggregation of all the streamed message output events, but that is not the case. I believe this is a conceptual misunderstanding, but I can’t find any thorough, in-depth documentation on how streaming works in the Agents SDK when multiple tools are involved.
For example, I have this basic workflow (sketched in code below):
1. User asks the agent a question.
2. The agent uses the web_search_preview tool to search for information.
3. A message output is created that answers the question using the information from the web search.
4. Per my prompt instructions, the agent then calls another tool to save this output to my database.
5. Another message output is created, very similar to the one in step 3 but also mentioning that the agent saved the info. This is what’s contained in the final_output of the run result object.
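Roughly, the setup looks like this. This is a simplified sketch: the tool name save_to_database and the instructions are placeholders, not my real code.

```python
from agents import Agent, WebSearchTool, function_tool

@function_tool
def save_to_database(answer: str) -> str:
    """Placeholder for my real DB tool; just reports success."""
    return "Success"

agent = Agent(
    name="Research agent",
    instructions=(
        "Answer the user's question using web search, "
        "then save your answer with save_to_database."
    ),
    tools=[WebSearchTool(), save_to_database],
)
```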
My question: in a streaming context, I don’t see a way to identify the message output generated in step 3 as not being the “real” final output.
For example, if I’m just passing message output deltas along to the user, they first see the entire output of step 3, then the entire output of step 5. However, only the output of step 5 is the real final_output that would be returned if I ran this synchronously, and presumably the response I’d want to save in a DB for conversation persistence. Inspecting both the raw and item stream events, I see no logical way to tell that the streamed message output events from step 3 are NOT part of what will ultimately be the agent’s final output. What am I missing?
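Concretely, this is roughly how I consume the stream (a simplified sketch; stream_to_user is just for illustration and assumes the agent sketched above):

```python
from openai.types.responses import ResponseTextDeltaEvent
from agents import ItemHelpers, Runner

async def stream_to_user(agent, question: str) -> None:
    result = Runner.run_streamed(agent, input=question)
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            # Deltas for the step-3 message and the step-5 message both arrive here;
            # nothing on the event marks the step-3 deltas as "not the final output".
            print(event.data.delta, end="", flush=True)
        elif event.type == "run_item_stream_event" and event.item.type == "message_output_item":
            print("\n[message output]", ItemHelpers.text_message_output(event.item))
    # Only the step-5 text ends up here:
    print("\nfinal_output:", result.final_output)
```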
are u payloading and how are u payloading? what index are you using?
cuz payloading like this, you said streaming, i deprecated my obs → python code, but by doing this you can achieve concurrent continuous payloads
especially since i think you want like long form outputs? id poll that bro bro
print(f"⏳ Polling operation status from: {operation_poll_url}")
while elapsed_wait_seconds < max_poll_wait_seconds:
poll_response = requests.get(operation_poll_url, headers=headers)
if not poll_response.ok: # Handle errors during polling
print(f"❌ Polling failed with status: {poll_response.status_code}")
try:
print(f" Polling Response JSON: {poll_response.json()}")
except json.JSONDecodeError:
print(f" Polling Response Text: {poll_response.text}")
poll_response.raise_for_status() # This will raise an HTTPError
operation_status_data = poll_response.json()
but fyi i believe it’s more effective to vector the data first; you’re already using agents, so just acquire the stream metadata and rebuild it using a ruleset.
I’m using the OpenAI Agents Python SDK and streaming results, like result = Runner.run_streamed(agent, ...). Unsure why I can’t link to docs.
My question is specific to the behavior of streaming with the Agents SDK. I’m wondering why intermediate tool calls within the orchestration path sometimes produce message output events that are not included in the agent’s final output. I’m not sure how to interpret these, and I see no clear, logical way to distinguish the message output events surrounding intermediate tool calls from the “final” message output events whose deltas actually contribute to the agent’s final output.
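For reference, this is basically all I can see when inspecting the events; the type names printed for the intermediate message and the final message look identical (again a simplified sketch, inspect_events is just for illustration):

```python
from agents import Runner

async def inspect_events(agent, question: str) -> None:
    result = Runner.run_streamed(agent, input=question)
    async for event in result.stream_events():
        if event.type == "raw_response_event":
            print("raw event:", type(event.data).__name__)
        elif event.type == "run_item_stream_event":
            print("run item:", event.item.type)
        elif event.type == "agent_updated_stream_event":
            print("agent updated:", event.new_agent.name)
```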
DUDE I had that issue before
I got you
have u upgraded ur pydantic ruleset? and schema enforcement so that the model can’t just randomly disobey?
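something like this (rough sketch, ur model and field names will differ):

```python
from pydantic import BaseModel
from agents import Agent

class FinalAnswer(BaseModel):
    answer: str        # example fields only
    saved_to_db: bool

agent = Agent(
    name="Research agent",
    instructions="Answer the question, save it to the DB, then return the final answer.",
    output_type=FinalAnswer,  # the run only ends once the model emits this structured output
)
```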
Yeah, I don’t think it’s a pydantic issue, as I’m up to date and not using structured outputs. I think it’s more a fundamental question about how the agent works and when/why it produces message outputs. Going back to the example:
1. User asks a question.
2. The agent invokes my custom tool, and the tool returns the raw string “Success”.
3. I receive “message output” events in the streamed response containing what the agent has to say about this tool use, e.g. “I called this tool and the invocation was successful”. Note this is NOT the tool output; this is the agent generating a message based on the current context.
4. The agent then calls another tool, per the system instructions.
5. The agent then produces another message output, same as (3). This message output is the final_output of the Runner results.
So my question is: how should I understand the message output created in step 3? Why is the agent producing a message output when it isn’t yet done with this orchestration cycle? Why is that message output not included in final_output? Since it’s not in final_output, I might not want to show it to the user, but I see no way to identify that it isn’t the final_output while reading the stream in real time (i.e. I only know it isn’t the final_output at the very end of the stream, once step 5 is complete).
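The only workaround I can think of is to buffer each message output as its item completes and decide what to surface only after the stream ends, rather than forwarding deltas blindly. A rough sketch (collect_message_outputs is just illustrative):

```python
from agents import ItemHelpers, Runner

async def collect_message_outputs(agent, question: str):
    result = Runner.run_streamed(agent, input=question)
    message_texts = []  # text of every message output, in order (steps 3 and 5 above)
    async for event in result.stream_events():
        if event.type == "run_item_stream_event" and event.item.type == "message_output_item":
            message_texts.append(ItemHelpers.text_message_output(event.item))
    # Only once the stream has ended do I know which message was actually final;
    # for a plain-text agent the last entry matches result.final_output.
    return message_texts, result.final_output
```

But that gives up streaming the final answer token by token, which is exactly what I’m trying to keep.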
bro bro - pydantic allows u to set the rules… for orchestration
so you just create a rule schema that enforces delivery
the orchestrator controls that; if u want it sequentially u just payload it and make sure ur pipeline logic is sound,
if u dont want to payload it, then parse it from a jsonl or metadata store
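rough example of the jsonl route (just illustrative, adapt it to ur pipeline):

```python
import json

from agents import ItemHelpers, Runner

async def dump_run_items(agent, question: str, path: str = "run_items.jsonl"):
    result = Runner.run_streamed(agent, input=question)
    with open(path, "w") as f:
        async for event in result.stream_events():
            if event.type != "run_item_stream_event":
                continue
            record = {"item_type": event.item.type}
            if event.item.type == "message_output_item":
                record["text"] = ItemHelpers.text_message_output(event.item)
            f.write(json.dumps(record) + "\n")
    return result.final_output  # re-read the jsonl later to rebuild whatever view u need
```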