I’d like to stream structured output responses, but I noticed that client.beta.chat.completions.parse doesn’t seem to support streaming.
Has anyone successfully implemented streaming structured outputs, or does this require a workaround that drops the Pydantic class for response_format? I'd love to hear about best practices or alternative approaches.
It does. You can up your API game with the SDK's streaming helpers, which also come with collectors that assemble the final response object for you.
Here's an example, switching over to my “python examples” directory:
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class SimpleResponse(BaseModel):
    answer: str  # swap in your own schema fields

request_content = "Why is the sky blue? Answer briefly."

with client.beta.chat.completions.stream(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant"},
        {"role": "user", "content": request_content},
    ],
    stream_options={"include_usage": True},
    max_completion_tokens=2000,  # openai.LengthFinishReasonError if the JSON is cut off unparsable
    # logprobs=True,
    # top_logprobs=1,
    response_format=SimpleResponse,
    # tools=NOT_GIVEN,  # if passed at all, tools cannot be an empty list
) as stream:
    for event in stream:
        process_event(event)
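The “collectors” I mentioned: while the with block is still open, the stream can hand back the fully assembled, parsed response once the events are exhausted. A small sketch, continuing inside the same block (the SimpleResponse fields are my placeholders):

    # still inside the with block, after the event loop
    completion = stream.get_final_completion()
    print(completion.choices[0].message.parsed)  # a SimpleResponse instance
    print(completion.usage)  # populated because include_usage was requested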
You get to write your own process_event handler; a hint, though:
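Here's a minimal sketch of one; the event types and attributes below (content.delta, content.done, chunk) are the ones emitted by the chat streaming helper:

def process_event(event) -> None:
    if event.type == "content.delta":
        # incremental text of the JSON being generated
        print(event.delta, end="", flush=True)
    elif event.type == "content.done":
        # event.parsed is the validated SimpleResponse instance
        print()
        print(event.parsed)
    elif event.type == "chunk":
        # raw ChatCompletionChunk, plus an accumulated snapshot
        pass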
Or get really dirty in the SDK internals and wait for things to break:
# private SDK import paths: these can break between openai-python releases
from openai._types import NOT_GIVEN, IncEx, NotGiven, Union, Any
from openai import BaseModel
from openai._streaming import json
from openai.lib import pydantic_function_tool
from openai.lib.streaming.chat import ChatCompletionStreamManager
from openai import ContentFilterFinishReasonError, APIResponseValidationError
from openai import Client

client = Client()
with ChatCompletionStreamManager(
    # the manager expects a callable returning a raw chunk stream,
    # so the underlying create() call must use stream=True
    api_request=lambda: client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": request_content}],
        stream=True,
    ),
    # drives client-side parsing only; the schema is not sent to the server here
    response_format=SimpleResponse,
    input_tools=NOT_GIVEN,
) as stream:
    for event in stream:  # SimpleResponse and process_event as defined above
        process_event(event)
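Those exception imports earn their keep when a stream ends badly. A sketch of catching the parse-time failures, assuming the same names as above (the length error is the one my max_completion_tokens comment warned about):

import openai

try:
    with client.beta.chat.completions.stream(
        model="gpt-4o",
        messages=[{"role": "user", "content": request_content}],
        max_completion_tokens=2000,
        response_format=SimpleResponse,
    ) as stream:
        for event in stream:
            process_event(event)
except openai.LengthFinishReasonError:
    print("max_completion_tokens cut the JSON off before it could be parsed")
except openai.ContentFilterFinishReasonError:
    print("generation was stopped by the content filter")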