Parsing a streamed JSON response in Node.js

I want to parse a streamed chat completion response as it arrives.

I’m using the new Node API library, v4.0.1. The response is a JSON array, and I need to parse individual elements as they are returned. I tried a Node library for stream parsing, stream-json, but the types seem incompatible.

Has anyone been able to achieve something similar?

Is there a reason you're using streaming versus just parsing the standard response after it completes?

The response I am generating is fairly big (10-15 questions and answers). I wanted to cut the time it takes to show something to the user.

I have written custom code to achieve it, but wanted to see if there was a way to use an existing library like stream-json.

My custom code:

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY // This is also the default, can be omitted
});

const stream = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  stream: true,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Return an array of 5 JSON objects. Each object contains two keys - head and body. Values are random words. Return only the JSON array. Do not include any additional commentary in the response." }
  ],
});

let data = ''; // Accumulates the chunks of response data

for await (const part of stream) {
  data += part.choices[0]?.delta?.content || ""; // accumulate

  // A single chunk can complete more than one object, so keep extracting.
  // Limitation: this only works for flat objects with no '}' inside string values.
  let endIndex;
  while ((endIndex = data.indexOf('}')) !== -1) {
    const startIndex = data.indexOf('{');

    const jsonObject = data.slice(startIndex, endIndex + 1); // Extract the JSON object
    data = data.slice(endIndex + 1); // Remove the extracted JSON object from the accumulated data
    try {
      const parsedObject = JSON.parse(jsonObject);
      console.log(parsedObject); // Handle the parsed JSON object here
      res.write(jsonObject); // `res` is the HTTP response of the enclosing request handler

      // Make an API call
    } catch (err) {
      console.error('Error while parsing JSON:', err);
    }
  }
}
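The `indexOf('}')` approach above stops at the first closing brace, so it breaks on nested objects or on a `}` inside a string value. A minimal sketch of a sturdier extractor follows, assuming the model returns a JSON array of objects; `extractObjects` is a hypothetical helper name, not part of any library.

```javascript
// Pulls every complete top-level {...} object out of a growing buffer.
// Tracks string state and brace depth, so nested objects and braces
// inside string values do not end an object early.
function extractObjects(buffer) {
  const objects = [];
  let depth = 0;
  let inString = false;
  let escaped = false;
  let start = -1;
  let consumed = 0; // index just past the last complete object

  for (let i = 0; i < buffer.length; i++) {
    const ch = buffer[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) {
        // Throws if the model produced invalid JSON; wrap in try/catch if needed.
        objects.push(JSON.parse(buffer.slice(start, i + 1)));
        consumed = i + 1;
      }
    }
  }
  // Keep only the unfinished tail for the next chunk.
  return { objects, rest: buffer.slice(consumed) };
}
```

Inside the `for await` loop you would append each chunk to the buffer, call `extractObjects(buffer)`, handle the returned `objects`, and carry `rest` over as the new buffer. The buffer is rescanned from its start on every chunk, which is fine for responses of this size.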

You should ask for multiple JSON objects then.

Of course a JSON library won’t be able to parse a string like this:

{
 "key":"this message was cut off here ->

because that is what you get mid-stream. You are trying to parse incomplete JSON.
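One way to "ask for multiple JSON" is to have the model emit one complete object per line and parse only finished lines. This is a sketch under that assumption (the prompt must request one JSON object per line with no surrounding array); `createLineParser` is a hypothetical helper name.

```javascript
// Buffers streamed text and hands over one parsed value per complete line.
// Assumes the model was prompted to emit one JSON object per line.
function createLineParser(onObject) {
  let buffer = "";
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // last piece may be an unfinished line
      for (const line of lines) {
        const trimmed = line.trim();
        if (trimmed) onObject(JSON.parse(trimmed));
      }
    },
    end() {
      // The final line usually has no trailing newline, so flush it here.
      const trimmed = buffer.trim();
      buffer = "";
      if (trimmed) onObject(JSON.parse(trimmed));
    },
  };
}
```

Call `push` with each streamed chunk and `end` once the stream finishes. Because only whole lines reach `JSON.parse`, a cut-off string never gets parsed.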

Or do it like I do: use hundreds of parallel requests and get tiny chunks.
It's more expensive but also more accurate.
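The parallel approach could look roughly like the sketch below. `requestOne` is a hypothetical function you would supply yourself: it performs a single non-streamed completion for one prompt and returns the reply text. Rate limits and cost are not handled here.

```javascript
// Runs one small request per prompt in parallel and reports each result
// as soon as it arrives. `requestOne` is a hypothetical caller-supplied
// function: (prompt) => Promise<string> containing one JSON document.
async function requestAll(prompts, requestOne, onResult) {
  const tasks = prompts.map(async (prompt, index) => {
    const text = await requestOne(prompt);
    const value = JSON.parse(text); // each reply is small and complete
    onResult(index, value);
    return value;
  });
  // allSettled keeps one failed request from discarding the others.
  return Promise.allSettled(tasks);
}
```

Each entry of the returned array has `status` set to `"fulfilled"` or `"rejected"`, so a single malformed reply can be retried without redoing the rest.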

There’s a library created for streaming JSON from OpenAI (the forum doesn’t let me post links; if anyone needs it, feel free to contact me at gonzalezneddy@gmail.com). I hope it’s useful. Also, if anyone knows how to use this with Swift, I'd appreciate a hand. Have a good one!

Thank you for the link. Optimistic JSON parsing is very interesting:

https://www.mikeborozdin.com/post/json-streaming-from-openai

Got to test that!
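For anyone curious about the general idea, here is a from-scratch sketch of optimistic parsing: close whatever is still open in the partial text and try to parse the result. This is not the code from the linked post or any library; `parsePartialJson` is a hypothetical name, and the sketch simply gives up (returns `undefined`) on prefixes it cannot complete.

```javascript
// Tries to parse a JSON prefix by closing whatever is still open.
// Returns undefined when the prefix cannot be completed this simply
// (for example when it ends on a dangling key, a colon, or a half literal).
function parsePartialJson(text) {
  const closers = []; // stack of "}" / "]" still owed
  let inString = false;
  let escaped = false;

  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  let candidate;
  if (inString) {
    // Drop a trailing lone backslash, then close the open string.
    candidate = (escaped ? text.slice(0, -1) : text) + '"';
  } else {
    // A trailing comma would make the closed-off JSON invalid.
    candidate = text.replace(/,\s*$/, "");
  }
  candidate += closers.reverse().join("");

  try {
    return JSON.parse(candidate);
  } catch {
    return undefined;
  }
}
```

In a streaming loop you would call this on the accumulated text after every chunk and keep the last value that was not `undefined`, so the UI always renders the most recent parseable snapshot.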

I built a few tools that let you parse streaming JSON and start consuming it safely before the stream has completed.

They work on the server or in the browser.

They support all valid JSON, including deeply nested, multi-dimensional structures.

Every key within the data structure streams independently, and the entire model is stubbed out based on the schema, so it is safe to read at any point during the stream. You can also provide defaults for any property, to be used before its value streams in.

Ready-to-use tools for TypeScript:
zod-stream on npm
stream-hooks on npm

The streaming JSON parser that powers them is schema-stream on npm.
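To illustrate the stubbing idea in isolation, here is a minimal sketch that overlays whatever has streamed in so far on top of a full set of defaults. It does not use or reproduce the API of the packages above; `withDefaults` and the example shape are hypothetical.

```javascript
// Deep-merges a partial value over a complete defaults object, so every
// expected property exists even before its real value has streamed in.
// Hypothetical helper, not the API of zod-stream or schema-stream.
function withDefaults(defaults, partial) {
  if (partial === undefined) return defaults;
  const isPlain = (v) => v !== null && typeof v === "object" && !Array.isArray(v);
  // Arrays and primitives replace the default outright.
  if (!isPlain(defaults) || !isPlain(partial)) return partial;
  const out = { ...defaults };
  for (const [key, value] of Object.entries(partial)) {
    out[key] = withDefaults(defaults[key], value);
  }
  return out;
}

// Example: the view is always safe to read, whatever has arrived so far.
const defaults = { title: "", answers: [], meta: { model: "unknown", done: false } };
const view = withDefaults(defaults, { title: "Hel", meta: { done: true } });
console.log(view.answers.length, view.meta.model);
```

The point of the design is that consumer code never needs `undefined` checks: missing keys fall back to their defaults until the stream fills them in.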

Browser/Next.js example:

Create the stream completion. You can use my helpers here: withResponseModel converts the zod schema to the proper JSON schema and tool function parameters, and OAIStream converts the SSE response to a readable stream.

import OpenAI from "openai"
import { z } from "zod"
import { withResponseModel, OAIStream } from "zod-stream"

const oai = new OpenAI({
  apiKey: process.env["OPENAI_API_KEY"] ?? undefined,
  organization: process.env["OPENAI_ORG_ID"] ?? undefined
})

export async function POST(request: Request) {
  const { messages } = await request.json()

  const params = withResponseModel({
    response_model: {
      schema: z.object({ content: z.string() }),
      name: "Content response"
    },
    params: {
      messages,
      model: "gpt-4",
      stream: true
    },
    mode: "TOOLS"
  })

  const extractionStream = await oai.chat.completions.create(params)

  return new Response(
    OAIStream({
      res: extractionStream
    })
  )
}

Simple React hook consumer:

import { z } from "zod"
import { useJsonStream } from "stream-hooks"

// Call the hook inside a React component
const { loading, startStream, stopStream, data } = useJsonStream({
  schema: z.object({
    content: z.string()
  }),
  onReceive: data => {
    console.log("incremental update to final response model", data)

    console.log(data.content) // this JSON stream is safe to read from before it has completed
  }
})

Vanilla consumer:

import { z } from "zod"
import ZodStream from "zod-stream"

const schema = z.object({ content: z.string() }) // should match schema of stream
const streamClient = new ZodStream()

const extractionStream = await streamClient.create({
  completionPromise: async () => await fetch("/path/to/stream"),
  response_model: { schema }
})

for await (const data of extractionStream) {
  console.log(data.content) // safe to read partial JSON
}
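If you would rather read the streamed response yourself, the body of a `fetch` response can be consumed chunk by chunk with the standard `ReadableStream` and `TextDecoder` APIs. The sketch below only yields decoded text; `readText` is a hypothetical helper name, and feeding that text into a partial-JSON parser is left to you.

```javascript
// Reads a fetch-style ReadableStream of bytes and yields decoded text.
// `stream: true` lets the decoder hold back a multi-byte character that
// was split across two chunks instead of emitting a replacement character.
async function* readText(body) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      yield decoder.decode(value, { stream: true });
    }
    const tail = decoder.decode(); // flush anything still buffered
    if (tail) yield tail;
  } finally {
    reader.releaseLock();
  }
}
```

Usage would look like `for await (const text of readText(response.body)) { ... }`, where `response` comes from `fetch("/path/to/stream")`. This works in browsers and in Node 18 or later, where `fetch`, `ReadableStream`, and `TextDecoder` are globals.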

For others looking for a Python solution, it looks like there are two libraries that can repair broken or invalid JSON:

  1. json-fixer
  2. json-repair