I am wondering if anyone has some recommended libraries or practices for rendering streamed content in real-time, as it is returned by the API, such that lists, bold, italics, etc. are rendered as such.
ChatGPT’s UX itself does this (just ask it to return a list of anything), and it’s really trivial for non-streaming content, but streaming formatted text content doesn’t seem to be that common. I have a few workarounds and prototypes, but nothing I feel is very robust.
We’re using Node on the backend and Vue3 on the front-end, so bonus for anyone with libraries in those ecosystems they can point me at, but frankly any recommended practices, libraries, or SDKs for streaming formatted text are welcome. Also happy to collaborate on an existing or new open-source project for this purpose - seems like a need exists.
Hey @TimJohns, did you finally find a solution for this? I am facing a similar issue where the markdown-formatted text won’t come along with streaming no matter what we do to the system or user prompt. And as far as I could see, there isn’t anything else we can do as of now other than fiddling with the prompt text.
I didn’t find a truly streaming solution, but the workaround I implemented was to re-render and sanitize the entire contents of the completion on each incoming chunk. It’s a good workaround, because the code is simple and it looks great MOST of the time, but there are a couple of issues:
If the markdown spans multiple chunks, it will render as undecorated text briefly before the next chunk comes in. This bothers ME, visually, but no users have complained about it, so my assumption is for most users it’s probably a minor annoyance that doesn’t rise to the level of complaint. We can tolerate that temporarily.
This approach is computationally inefficient. That said, we haven’t had any complaints about using too much CPU in users’ browsers, and in our own profiling it’s immaterial. Still, the obviously inefficient implementation is a distraction in the CODE - pretty much every developer who’s looked at it has tried to ‘fix’ it, with no luck so far.
Similarly, rendering partial markdown as HTML is an obvious distraction for our secure-coding folks. While none (so far) have found a vulnerability, we’ve spent a lot of time analyzing it. Everyone we’ve had look at it has ultimately determined they’re comfortable with it – but just like the inefficiency issue, it’s a distraction for pretty much anyone who looks at it. Parsing partial input is the kind of general pattern that can expose injection and overrun/underrun vulnerabilities.
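To make the re-render-everything workaround (and its brief flicker) concrete, here’s a framework-free sketch. The `renderMarkdown` below is just a toy stand-in for markdown-it plus a sanitizer, handling only `**bold**` so the example is self-contained; the names are hypothetical, not our actual code.

```javascript
// Toy stand-in for markdown-it + sanitization: only handles **bold**.
function renderMarkdown(text) {
  return text.replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>');
}

// Accumulate the full completion and re-render the WHOLE buffer per chunk.
function makeStreamRenderer(onHtml) {
  let full = '';
  return function onChunk(chunk) {
    full += chunk;
    onHtml(renderMarkdown(full)); // entire buffer, every time
  };
}

// Markdown split across chunks: the first render shows undecorated text,
// and the next render "heals" it once the closing ** arrives.
const frames = [];
const push = makeStreamRenderer((html) => frames.push(html));
push('Here is **bo');
push('ld** text.');
// frames[0] === 'Here is **bo'                        (briefly undecorated)
// frames[1] === 'Here is <strong>bold</strong> text.' (healed)
```

The flicker in issue 1 above is exactly `frames[0]`: the unmatched `**` renders as plain text until the closing delimiter streams in.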
Here’s a stripped-down version of our Vue3 component. Apropos of the above commentary, v-sanitize on line 4 is a reference to vue-sanitize-directive, and the (Vue-specific) computed summaryMarkdown is re-computed every time events.onmessage updates summary.value.
EDIT: markdown-it is also doing a lot of the heavy lifting here.
As I also needed one and didn’t find anything suitable, I tried to build one myself. If you or anyone else is brave enough to test it:
it’s on GitHub - search for StreamMdProcessor
Might be a little late for a reply, but posting here in case anyone else runs into the issue.
This is the frontend JS code I used in my chatbot app for processing the streamed response from a node.js express application.
// Process the streaming response from the assistant
async function processAssistantResponse(response) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let done = false;

  // Create a new message div for the assistant's response
  const assistantMessageDiv = createAssistantMessageDiv();
  let buffer = '';

  while (!done) {
    const { value, done: readerDone } = await reader.read();
    done = readerDone;
    if (value) {
      // stream: true keeps multi-byte characters split across chunks intact
      const chunk = decoder.decode(value, { stream: true });
      buffer += chunk;
      // Process the buffer for SSE messages, one line at a time
      const lines = buffer.split('\n');
      // Process all complete lines except the possibly incomplete last line
      for (let i = 0; i < lines.length - 1; i++) {
        const line = lines[i].trim();
        if (line) {
          updateAssistantMessage(assistantMessageDiv, line);
        }
      }
      // Keep the last line in the buffer (in case it's incomplete)
      buffer = lines[lines.length - 1];
    }
  }
}

// Update the assistant's message div with new content
function updateAssistantMessage(messageDiv, message) {
  // Format the message content using the formatMessage function below
  const formattedMessage = formatMessage(message);
  messageDiv.innerHTML += formattedMessage;
  // Scroll to the bottom of the chatbox
  const chatBox = document.getElementById('chat-box');
  chatBox.scrollTop = chatBox.scrollHeight;
}

// Convert Markdown syntax into sanitized HTML
function formatMessage(message) {
  // Convert Markdown to HTML
  const rawHTML = marked.parse(message);
  // Sanitize the HTML before it touches the DOM
  const sanitizedHTML = DOMPurify.sanitize(rawHTML);
  return sanitizedHTML;
}
To this you need to add the following to your HTML to load the libraries needed by the formatMessage() function:
<!-- Include Marked and DOMPurify via CDN before your main script -->
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/dompurify/dist/purify.min.js"></script>
You need an incremental parser like Lezer or Tree-sitter - the kind used for constructing ASTs in text/code editors (streaming text in is nearly the same problem as a user typing at the end of a document).
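A hand-rolled approximation of what an incremental parser buys you - not Lezer or Tree-sitter themselves, just the core idea at block granularity: blocks separated by blank lines can never change once a later block starts, so only the trailing, still-growing block needs re-rendering on each chunk. All names below are hypothetical, and the toy renderer again stands in for a real Markdown pass:

```javascript
// Toy renderer standing in for a real Markdown-to-HTML pass.
function renderBlock(text) {
  return text.replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>');
}

// Cache completed blocks' HTML; re-render only the trailing block per chunk.
function makeIncrementalRenderer() {
  let text = '';
  const doneHtml = []; // rendered HTML for finalized blocks
  let doneCount = 0;   // how many blocks are already final
  return function onChunk(chunk) {
    text += chunk;
    const blocks = text.split('\n\n');
    // every block except the last can no longer change, so render it once
    while (doneCount < blocks.length - 1) {
      doneHtml.push(renderBlock(blocks[doneCount++]));
    }
    return doneHtml.concat(renderBlock(blocks[blocks.length - 1])).join('\n');
  };
}

const render = makeIncrementalRenderer();
render('**a**\n\n**b');           // first block finalized; tail still open
const html = render('**\n\nrest'); // second block closes, third begins
// html === '<strong>a</strong>\n<strong>b</strong>\nrest'
```

Real incremental parsers do this at the syntax-tree level (reusing unchanged subtrees), which also handles constructs that span blocks, like fenced code - something this blank-line heuristic gets wrong.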