Streaming using Structured Outputs

sebastian.chejniak · September 10, 2024, 3:44pm

Maybe I’m missing something, but I had no issues streaming output from the structured outputs API in a simple way. The client.beta.chat.completions object has a .stream method seemingly tailor-made for this.

For example, this generator function works fine for me. On every chunk it yields the cumulative streamed response, plus the token usage once that arrives with the final chunk:

from openai import OpenAI

# Generator
def openai_structured_outputs_stream(**kwargs):
    client = OpenAI()

    with client.beta.chat.completions.stream(**kwargs, stream_options={"include_usage": True}) as stream:
        for chunk in stream:
            if chunk.type == 'chunk':
                latest_snapshot = chunk.to_dict()['snapshot']
                # The first chunk doesn't have the 'parsed' key, so using .get to prevent raising an exception
                latest_parsed = latest_snapshot['choices'][0]['message'].get('parsed', {})
                # Note that usage is not available until the final chunk
                latest_usage  = latest_snapshot.get('usage', {})
                latest_json   = latest_snapshot['choices'][0]['message']['content']

                yield latest_parsed, latest_usage, latest_json

Usage:
You can stream the output, e.g. as a pandas DataFrame as below. It looks ugly because this example redraws the entire DataFrame on every chunk; that is purely for illustration.

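import pandas as pd
from IPython.display import display, clear_output

# model_name, temperature, messages and YourPydanticModel are placeholders you define yourself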

for parsed_completion, completion_usage, completion_json in openai_structured_outputs_stream(
    model=model_name,
    temperature=temperature,
    messages=messages,
    response_format=YourPydanticModel
):
    clear_output()
    display(pd.DataFrame(parsed_completion))
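
If you only need the final parsed object and the token usage, rather than a redraw on every chunk, you can drain the generator and keep the last tuple. A minimal sketch, assuming the generator yields (parsed, usage, json) tuples like the function above. `last_yield` and `fake_stream` are hypothetical names, and `fake_stream` is a stand-in with made-up values so the snippet runs without an API key:

```python
from collections import deque


def last_yield(generator):
    """Exhaust a generator and return its final item, or None if it was empty."""
    tail = deque(generator, maxlen=1)  # keeps only the most recent item
    return tail[0] if tail else None


def fake_stream():
    """Hypothetical stand-in for openai_structured_outputs_stream(**kwargs)."""
    yield {}, {}, '{"na'
    yield {"name": "Ada"}, {}, '{"name": "Ada"}'
    # Usage only shows up with the last chunk
    yield {"name": "Ada"}, {"total_tokens": 42}, '{"name": "Ada"}'


final_parsed, final_usage, final_json = last_yield(fake_stream())
print(final_parsed, final_usage)
```

With the real generator you would pass `openai_structured_outputs_stream(...)` to `last_yield` instead of `fake_stream()`.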

Notes:
In my output the stream yields three event types: chunk.type == 'chunk', chunk.type == 'content.delta', and chunk.type == 'content.done'. They share a lot of data, hence the if statement that uses only one of them. I believe the content.delta type contains the changes between consecutive chunks. As far as I can tell, the library also has event types for refusals, tool calls, and logprobs, which this function ignores.
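
To make the relationship between these event types concrete, here is a toy sketch. It assumes, as I believe above, that each content.delta carries only the new text while the snapshot is the running total. `FakeEvent` and `fake_events` are hypothetical stand-ins, not the library's classes (the real snapshot is a completion object, not a string), so this runs offline:

```python
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a stream event (not an openai class)."""
    type: str
    delta: str = ""
    snapshot: str = ""


def fake_events(pieces):
    """Emit a 'chunk' and a 'content.delta' event per piece, then 'content.done'."""
    so_far = ""
    for piece in pieces:
        so_far += piece
        yield FakeEvent(type="chunk", snapshot=so_far)
        yield FakeEvent(type="content.delta", delta=piece, snapshot=so_far)
    yield FakeEvent(type="content.done", snapshot=so_far)


pieces = ['{"na', 'me": ', '"Ada"}']
events = list(fake_events(pieces))

# Filtering on one type avoids handling the same text several times
snapshots = [e.snapshot for e in events if e.type == "chunk"]
rebuilt = "".join(e.delta for e in events if e.type == "content.delta")

print(snapshots[-1] == rebuilt)
```

Under that assumption, joining the deltas reproduces the last snapshot, which is why reading only the 'chunk' events loses nothing.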
