Streaming Assistant JSON Response and Using One Key/Value for Chat

I have a medical coding assistant and would like to keep track of its responses to questions, especially which medical codes it recommends. I currently use formatting markers (e.g. surrounding medical codes with @@@), which works okay, but I'd prefer a structured JSON response. The issue is that some responses can be very long, so I would also like to keep using streaming.

Right now, when I try to parse the streamed textDelta value as JSON, I get errors because each delta only carries a fragment of the full JSON, for example:

{
    "value": "{\"",
    "annotations": []
}

I think I could overcome this by spending more time on my processStream function, but I want to ask whether someone has a smarter way of doing this, since it seems like a common use case.

My current processStream (in JavaScript):

const processStream = async (res) => {
		clearStreamOutput()
		if (!res.ok) {
			throw new Error("Network response was not ok")
		}

		const reader = res.body.getReader()
		const decoder = new TextDecoder("utf-8")

		let buffer = ""

		let done = false
		while (!done) {
			const { value, done: readerDone } = await reader.read()
			done = readerDone
			if (value) {
				buffer += decoder.decode(value, { stream: true })

				// Process complete JSON objects in the buffer
				let boundary
				while ((boundary = buffer.indexOf("\n")) !== -1) {
					const chunk = buffer.slice(0, boundary).trim()
					buffer = buffer.slice(boundary + 1)

					if (chunk) {
						try {
							const json = JSON.parse(chunk)

							// Check if "textDelta.value" exists and contains stringified JSON
							if (json.textDelta && json.textDelta.value) {
								let parsedData
								try {
									parsedData = JSON.parse(json.textDelta.value)
									console.log("Parsed JSON from textDelta.value:", parsedData)

									// Handle the "chat_response" field
									if (parsedData.chat_response) {
										setStreamOutput(parsedData.chat_response) // Display the chat response
										updateMessages(parsedData.chat_response) // Update messages
									}

									// Handle the "medical_codes" field
									if (
										parsedData.medical_codes &&
										Array.isArray(parsedData.medical_codes)
									) {
										parsedData.medical_codes.forEach((code) => {
											for (const [key, value] of Object.entries(code)) {
												console.log(`Code: ${key}, Definition: ${value}`)
												addMedicalCode({ code: key, definition: value }) 
											}
										})

										addMedicalCodes(parsedData.medical_codes)
									}
								} catch (e) {
									console.error(
										"Error parsing textDelta.value as JSON:",
										e,
										json.textDelta.value
									)
								}
							}

							if (json.threadCreated) {
								thread.value = json.threadCreated
							} else if (json.messageCreated) {
								addMessage("assistant", "...")
								runId.value = json.messageCreated.run_id
								messageId.value = json.messageCreated.id
								threadId.value = json.messageCreated.thread_id
							} else if (json.toolCallCreated) {
								addToolCall({ type: "created", data: json.toolCallCreated })
							} else if (json.toolCallDelta) {
								addToolCall({ type: "delta", data: json.toolCallDelta })
							} else if (json.textDone) {
								loading.value = false
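							} else if (!json.textDelta) {
								// textDelta is already handled above, so only report events that nothing else matched
								console.error("Unhandled stream event:", json)
							}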
						} catch (e) {
							console.error("Error parsing chunk as JSON:", e, chunk)
						}
					}
				}
			}
		}
	}

Hi!

If I understand your problem correctly, you've started parsing the JSON deltas, but now you want to parse the JSON inside those deltas as it comes in?
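If the structured data is only needed once the message is finished, the simplest route is probably to stop parsing each textDelta.value on its own, append the fragments to a string, and parse once when textDone arrives. A minimal sketch, assuming the fragments concatenate into a single valid JSON document; createDeltaCollector is a made-up helper name, not part of any SDK:

```javascript
// Hypothetical helper (not part of any SDK): buffers textDelta fragments
// and parses the combined text only when the message is complete.
function createDeltaCollector() {
  let text = ""
  return {
    // Call for every textDelta event; returns the raw text received so far.
    push(fragment) {
      text += fragment
      return text
    },
    // Call on textDone; resets the buffer and returns the parsed object.
    // Throws if the combined text is not valid JSON.
    finish() {
      const raw = text
      text = ""
      return JSON.parse(raw)
    },
  }
}

// Simulated deltas, starting with the '{"' fragment from the question.
const collector = createDeltaCollector()
for (const fragment of ['{"', 'chat_response": "Hi", ', '"medical_codes": []}']) {
  collector.push(fragment)
}
const result = collector.finish()
console.log(result.chat_response) // prints: Hi
```

In processStream this would mean calling push(json.textDelta.value) in the textDelta branch and finish() in the textDone branch. The trade-off is that nothing structured is available until the very end.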

Have you considered using a JSON stream parser?

here’s a commonly used one (seems to be stable and popular)

I also wrote my own (but it comes with absolutely no warranty)

You would listen for specific fields (in this case, for example, icd_code.code), and then you can pipe them wherever you need with RxJS or something similar.

this is a lil simulator, but it would look something like this:

as you can see, objects resolve as they become available
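To give a rough idea of how partial objects can resolve while text is still arriving, here is a small hand-rolled sketch. It is not either of the parsers mentioned above, just a best-effort trick: close whatever string, array, or object is still open and try JSON.parse on the result. parsePartialJson is a hypothetical name, and the chat_response / medical_codes shape is taken from the question's code:

```javascript
// Best-effort: close any open string/array/object so a JSON prefix can be parsed.
// Returns undefined when the prefix ends somewhere that cannot be closed cleanly
// (for example inside a key, right after a colon or comma, or mid-literal).
function parsePartialJson(text) {
  const closers = [] // stack of pending "}" / "]"
  let inString = false
  let escaped = false

  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false
      else if (ch === "\\") escaped = true
      else if (ch === '"') inString = false
    } else if (ch === '"') {
      inString = true
    } else if (ch === "{") {
      closers.push("}")
    } else if (ch === "[") {
      closers.push("]")
    } else if (ch === "}" || ch === "]") {
      closers.pop()
    }
  }

  // Close the open string first, then the innermost containers outward.
  const candidate = text + (inString ? '"' : "") + closers.reverse().join("")
  try {
    return JSON.parse(candidate)
  } catch {
    return undefined
  }
}

// Simulated deltas, split at awkward places on purpose.
const deltas = [
  '{"',
  'chat_response": "Use code ',
  'W61.62XA',
  '", "medical_codes": [{"W61.62XA": "Struck by ',
  'duck, initial encounter"}]}',
]

let text = ""
let latest // last snapshot that parsed successfully
for (const delta of deltas) {
  text += delta
  const snapshot = parsePartialJson(text)
  if (snapshot !== undefined) {
    latest = snapshot
    console.log(latest.chat_response) // grows as more text arrives
  }
}
```

Note that a snapshot can contain a value that is still growing (the definition above is briefly "Struck by "), so this works well for displaying chat_response live, while fields like the medical codes are safer to treat as final only once textDone has arrived. A real stream parser tracks value completion properly and avoids re-parsing the whole prefix on every delta.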

is that what you were looking for?

test object
{
  "patient": {
    "name": "Gerald Quacksworth",
    "dob": "06/12/1985",
    "mrn": "123456789"
  },
  "visit": {
    "date": "01/18/2025",
    "presenting_complaint": "Left arm pain and minor lacerations"
  },
  "history_of_present_illness": [
    "The patient is a 39-year-old male who presented to the clinic following an incident at [...] or unsteady balance."
  ],
  "physical_examination": {
    "general": "Alert, oriented, in no acute distress.",
    "extremities": [
      "Left forearm: 2 cm superficial laceration, no signs of infection or foreign bodies.",
      "Mild erythema noted around the area.",
      "Range of motion intact; no tenderness beyond the immediate site of the laceration."
    ],
    "other_systems": "Within normal limits."
  },
  "assessment": [
    "Superficial laceration of the left forearm due to contact with shrubbery during an animal-related incident."
  ],
  "icd_code": {
    "code": "W61.62XA",
    "description": "Struck by duck, initial encounter"
  },
  "plan": [
    "Clean and dress the wound.",
    "Prescribe topical antibiotic ointment to prevent infection.",
    "Administer tetanus booster as a precaution.",
    "Advise patient to monitor for signs of infection (redness, swelling, fever).",
    "Educate patient on maintaining a safe distance from overly assertive ducks and encourage wearing long sleeves when near the pond."
  ],
  "follow_up": [
    "The patient should return to the clinic in one week for wound evaluation or sooner if there are signs of infection."
  ],
  "physician": {
    "name": "Dr. Mallory Wingate",
    "signature": "M. Wingate, MD"
  }
}