Is it possible to stream using structured outputs?
Say I define, using Pydantic or an equivalent, a structure that looks like this:
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    profession: str
And I give GPT a prompt like “The user input will be a story. Read the story and identify all of the characters in the story.”
And I want GPT to return an array of Person objects.
Rather than return them all at once as an array, is it possible to have GPT send me each person as it identifies them, one by one? This would be very helpful as it would allow me to reduce the time to get the first piece of information back.
It is possible, but you need code on your end to parse the chunks. I haven’t used the helper libraries, so I don’t know whether they can hand you values while the result is still incomplete, but essentially that is what you need to do yourself.
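For concreteness, here is a minimal sketch of that chunk parsing, assuming the raw streaming chat completions API and a strict schema whose root is an object wrapping the array, e.g. {"people": [ ... ]}. The function name and the open_depth parameter are illustrative, not part of any library:

import json
from typing import Iterator

def iter_complete_objects(chunks: Iterator[str], open_depth: int = 1) -> Iterator[dict]:
    """Yield each complete JSON object whose opening brace sits at
    `open_depth` levels of brace nesting (1 = the Person objects inside
    a top-level wrapper like {"people": [ ... ]}).

    Tracks brace depth while ignoring braces inside string literals,
    and parses each object as soon as its closing brace arrives, so
    callers see each Person without waiting for the full array.
    """
    buf: list[str] = []   # characters of the object being collected
    depth = 0             # current brace nesting depth
    capturing = False     # inside a target object?
    in_string = False     # inside a JSON string literal?
    escaped = False       # previous character was a backslash?
    for chunk in chunks:
        for ch in chunk:
            if capturing:
                buf.append(ch)
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                continue
            if ch == '"':
                in_string = True
            elif ch == "{":
                if depth == open_depth and not capturing:
                    capturing = True
                    buf = ["{"]
                depth += 1
            elif ch == "}":
                depth -= 1
                if capturing and depth == open_depth:
                    capturing = False
                    yield json.loads("".join(buf))

# Hypothetical usage with a streaming response:
#   text = (c.choices[0].delta.content or "" for c in stream)
#   for person in iter_complete_objects(text):
#       print(person)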
Hey @allyssonallan, I’ve just gone through the tutorial you linked. There were some good examples in there, but nothing about streaming. Did I miss something in there?
Hi @expertise.ai.chat, I managed to find a workaround: create a wrapper for the Pydantic base class and process the JSON schema the same way the streaming beta API does.
from pydantic.json_schema import (
    DEFAULT_REF_TEMPLATE,
    GenerateJsonSchema,
    JsonSchemaMode,
    model_json_schema
)
from typing import Any

from pydantic import BaseModel
from openai.lib._pydantic import _ensure_strict_json_schema


class BaseModelOpenAI(BaseModel):
    """Base class whose JSON schema is post-processed the same way the
    streaming beta API does, so it works with strict structured outputs."""

    @classmethod
    def model_json_schema(
        cls,
        by_alias: bool = True,
        ref_template: str = DEFAULT_REF_TEMPLATE,
        schema_generator: type[GenerateJsonSchema] = GenerateJsonSchema,
        mode: JsonSchemaMode = 'serialization'
    ) -> dict[str, Any]:
        # Generate the standard Pydantic schema first...
        json_schema = model_json_schema(
            cls,
            by_alias=by_alias,
            ref_template=ref_template,
            schema_generator=schema_generator,
            mode=mode
        )
        # ...then tighten it with OpenAI's own strict-schema helper.
        return _ensure_strict_json_schema(json_schema, path=(), root=json_schema)
Your classes should inherit from BaseModelOpenAI and then you need to pass the response format as follows:
{
    "type": "json_schema",
    "json_schema": {
        "name": response_class.__name__,
        "schema": response_class.model_json_schema(),
        "strict": True
    }
}
Then you can use the standard client.chat.completions.create to send your request and get a streaming response.
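For example, here is a minimal end-to-end sketch of the approach; the model name, the People wrapper class, and the story variable are assumptions for illustration, not from the original post:

from openai import OpenAI

class Person(BaseModelOpenAI):
    name: str
    age: int
    profession: str

class People(BaseModelOpenAI):
    # Strict structured outputs require an object at the root,
    # so the array of Person objects is wrapped in a field.
    people: list[Person]

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {"role": "system",
         "content": "The user input will be a story. Identify all of the characters."},
        {"role": "user", "content": story},  # `story` assumed to be defined
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": People.__name__,
            "schema": People.model_json_schema(),
            "strict": True
        }
    },
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # the JSON arrives token by token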
Wow @andreasantoro.pvt! This is really cool. It is streaming the JSON response.
Do you happen to know, though, whether there’s a way to get it to stream one key and value at a time rather than word by word?
So for example, if I should get back
{
key1: val1,
key2: val2,
key3: val3
}
I’ll get back
{key1: val1}
{key2: val2}
{key3: val3}
or something similar instead of
{
key1
:
val1
,
key2
:
val2
,
key3
:
val3
}
Thanks
@expertise.ai.chat
I don’t think so, since the generation happens token by token. If it’s just a matter of displaying the result to your users, you could accumulate the chunk contents until a new “:” sign (or “}”) has been reached.
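A minimal sketch of that accumulation idea (deliberately naive: it does not guard against “:” or “}” appearing inside string values); the function name and delimiter set are illustrative:

def flush_on_delimiters(chunks, delimiters=(":", ",", "}")):
    """Buffer streamed text and release it one delimiter-terminated
    piece at a time, so the UI updates per key/value instead of per token."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        # Flush everything up to (and including) each delimiter we have so far.
        while True:
            cuts = [buf.index(d) for d in delimiters if d in buf]
            if not cuts:
                break
            cut = min(cuts) + 1
            yield buf[:cut]
            buf = buf[cut:]
    if buf:
        yield buf  # whatever remains when the stream ends

Feeding the delta contents from the stream through this generator would then display roughly one key or value per update.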