How can I stream only part of a Pydantic response using OpenAI's Agents SDK?

Hi everyone,

I’m using the OpenAI Agents SDK with streaming enabled, and my output_type is a Pydantic model with three fields (a simplified example for demonstration):

class Output(BaseModel):
    joke1: str
    joke2: str
    joke3: str

Here’s the code I’m currently using to stream the output:

import asyncio
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner
from pydantic import BaseModel

class Output(BaseModel):
    joke1: str
    joke2: str
    joke3: str

async def main():
    agent = Agent(
        name="Joker",
        instructions="You are a helpful assistant.",
        output_type=Output
    )

    result = Runner.run_streamed(agent, input="Please tell me 3 jokes.")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())

Problem: This code streams the full response, including all three jokes (joke1, joke2, joke3).
What I want: to stream only the first joke (joke1) and stop printing once it ends, while still keeping the full response internally for later use.

Is there a clean built-in way to detect when joke1 ends during streaming and stop printing further output, without modifying the Output model?
Any help or suggestions would be greatly appreciated!


Hi, welcome to the forum.

I think the answer in this case is to split your Pydantic model into three separate models.

But in that case, I’ll need to specify one of the models as the agent’s output type. What happens to the other two? My goal is for the agent to generate all three pieces of information, but only stream the first one.

I found a workaround by parsing the response and using start/stop tokens to extract the first model attribute, but this approach has some downsides.
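
In case it helps anyone, here is a rough sketch of the parsing idea. It is not an SDK feature: FieldStreamer is a hypothetical helper I made up for illustration, and it assumes the streamed text is a flat JSON object whose target field is a plain string (which is what the deltas looked like for my Output model). It treats the opening and closing quotes of the "joke1" value as the start/stop markers and decodes JSON escapes along the way.

```python
import json


class FieldStreamer:
    """Hypothetical helper: emit only one string field from streamed JSON text.

    Assumes a flat JSON object with string values. \\uXXXX escapes are decoded
    one code unit at a time, so surrogate pairs are not recombined.
    """

    _ESCAPES = {'"': '"', "\\": "\\", "/": "/",
                "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t"}

    def __init__(self, field: str):
        self.field = field
        self.done = False          # True once the target value's closing quote is seen
        self._in_string = False
        self._capturing = False    # True while inside the target field's value
        self._escape = False
        self._hex = None           # hex digits collected for a \uXXXX escape
        self._current = []         # characters of the non-target string being read
        self._last_string = None   # most recently completed string (a key or a value)
        self._pending_key = None   # key whose value is expected next

    def feed(self, chunk: str) -> str:
        """Consume one text delta and return the part that belongs to the field."""
        out = []
        for ch in chunk:
            if self.done:
                break
            if not self._in_string:
                if ch == '"':
                    self._in_string = True
                    self._current = []
                    self._capturing = self._pending_key == self.field
                elif ch == ":":
                    self._pending_key = self._last_string
                elif ch in ",{":
                    self._pending_key = None
                continue
            decoded = self._decode(ch)
            if decoded is None:            # unescaped closing quote
                self._in_string = False
                if self._capturing:
                    self.done = True
                else:
                    self._last_string = "".join(self._current)
            elif self._capturing:
                out.append(decoded)
            else:
                self._current.append(decoded)
        return "".join(out)

    def _decode(self, ch: str):
        """Return the decoded character, '' if more input is needed, None at string end."""
        if self._hex is not None:
            self._hex += ch
            if len(self._hex) < 4:
                return ""
            code, self._hex = int(self._hex, 16), None
            return chr(code)
        if self._escape:
            self._escape = False
            if ch == "u":
                self._hex = ""
                return ""
            return self._ESCAPES.get(ch, ch)
        if ch == "\\":
            self._escape = True
            return ""
        if ch == '"':
            return None
        return ch


# Simulated deltas; in the real loop each chunk would be event.data.delta.
chunks = ['{"jo', 'ke1":"Why did the chi', 'cken \\"cross\\"?","joke2"',
          ':"second","joke3":"third"}']
streamer = FieldStreamer("joke1")
shown = "".join(streamer.feed(c) for c in chunks)  # only the joke1 text
full = json.loads("".join(chunks))                 # full response kept for later
print(shown)
```

In the streaming loop from my first post, the print call would become print(streamer.feed(event.data.delta), end="", flush=True). The loop still has to run to the end so the agent finishes generating; as far as I can tell, the parsed Output is then available as result.final_output. The downsides I mentioned are mostly the assumptions above: nested objects, non-string fields, or a change in how the model serializes the output would break it.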
