Streaming Markdown or Other Formatted Text

Folks,

I am wondering if anyone has some recommended libraries or practices for rendering streamed content in real-time, as it is returned by the API, such that lists, bold, italics, etc. are rendered as such.

ChatGPT’s UX itself does this (just ask it to return a list of anything), and it’s really trivial for non-streaming content, but streaming formatted text content doesn’t seem to be that common. I have a few workarounds and prototypes, but nothing I feel is very robust.

We’re using Node on the backend and Vue3 on the front-end, so bonus for anyone with libraries in those ecosystems they can point me at, but frankly any recommended practices, libraries, or SDKs for streaming formatted text are welcome. Also happy to collaborate on an existing or new open-source project for this purpose - seems like a need exists.

Thanks!

-Tim

Not sure, but I am playing with a Markdown-to-HTML JavaScript library, the marked parser (marked.js.org), to parse the chat stream string.

I only have a problem when the stream itself contains HTML code.

Hey @TimJohns, did you ever find a solution for this? I am facing a similar issue where the markdown-formatted text won’t render along with streaming, no matter what we do to the system or user prompt. As far as I can see, there isn’t anything else we can do right now other than fiddling with the prompt text.

I didn’t find a truly streaming solution, but the workaround I implemented was to re-render and sanitize the entire contents of the completion on each incoming chunk. It’s a good workaround, because the code is simple and it looks great MOST of the time, but there are a couple of issues:

  • If the markdown spans multiple chunks, it will render as undecorated text briefly before the next chunk comes in. This bothers ME, visually, but no users have complained about it, so my assumption is for most users it’s probably a minor annoyance that doesn’t rise to the level of complaint. We can tolerate that temporarily.

  • This approach is computationally inefficient. That said, we haven’t had any complaints about excessive CPU use in users’ browsers, and in our own profiling the cost is immaterial. Still, the obviously inefficient implementation is a distraction in the CODE - pretty much every developer who’s looked at it has tried to ‘fix’ it, with no luck so far.

  • Similarly, rendering partial markdown as HTML is an obvious distraction for our secure coding folks. While none (so far) have found a vulnerability, we’ve spent a lot of time analyzing it. Everyone we’ve had look at it has ultimately determined they’re comfortable with it, but just like the inefficiency issue, it’s a distraction for pretty much anyone who looks at it. Parsing partial input is the kind of general pattern that can expose injection and buffer overrun/underrun vulnerabilities.
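For the first issue (markup that spans chunks briefly showing up as plain text), one possible mitigation is to close any dangling markers on a copy of the text just before rendering it. The sketch below is only a heuristic and is not what we run in production: closeDanglingMarkdown is a hypothetical helper name, it only handles code fences, inline code, and ** bold, and it can guess wrong (for example when only half of a closing ** has arrived).

```javascript
// Hypothetical helper, not from the original post. Returns a copy of the partial
// Markdown with an unterminated code fence, inline code span, or bold marker closed.
// Use the result for rendering only; keep accumulating the unmodified text.
const FENCE = '`'.repeat(3); // three backticks, built this way so the snippet is easy to embed

function closeDanglingMarkdown(partial) {
  // An odd number of lines starting with a fence means a code block is still open.
  const fenceLines = partial.split('\n').filter((line) => line.startsWith(FENCE)).length;
  if (fenceLines % 2 === 1) {
    return partial + (partial.endsWith('\n') ? '' : '\n') + FENCE;
  }

  // Inline markers rarely span lines, so only the last line is inspected.
  const lastLine = partial.slice(partial.lastIndexOf('\n') + 1);

  // Drop complete inline code spans so their contents are not counted below.
  const outsideCode = lastLine.replace(/`[^`]*`/g, '');
  if (outsideCode.includes('`')) {
    return partial + '`';
  }

  // An odd number of ** markers means bold text is still open.
  const boldMarkers = (outsideCode.match(/\*\*/g) || []).length;
  return boldMarkers % 2 === 1 ? partial + '**' : partial;
}
```

The idea would be to pass closeDanglingMarkdown(text) to the markdown renderer while still appending incoming chunks to the untouched text, so a wrong guess is corrected on the next chunk.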

Here’s a stripped-down version of our Vue3 component. Apropos of the above commentary, v-sanitize on line 4 is a reference to vue-sanitize-directive, and (Vue-specific) summaryMarkdown is re-computed every time events.onmessage updates summary.value.

EDIT: markdown-it is also doing a lot of the heavy lifting here.

<template>
  <div class="card w-100">
    <div class="card-body w-100">
      <span v-sanitize="summaryMarkdown"></span>
    </div>
  </div>
</template>

<script setup lang="ts">
import { computed, ref, watch } from 'vue';
import MarkdownIt from "markdown-it";
import { base64Decode } from "./utils";

const props = defineProps<{
  streamUrl: string | undefined,
}>();

const summary = ref<string>('');
const isStreaming = ref<boolean>(false);

const markdown = new MarkdownIt();
// Re-render the whole accumulated completion whenever summary changes.
const summaryMarkdown = computed(() => {
  if (!summary.value) return '';
  return markdown.render(summary.value);
});

function startStream(streamUrl: string) {
  isStreaming.value = true;
  summary.value = '';

  const events = new EventSource(streamUrl);

  // Default (unnamed) events carry base64-encoded chunks of the completion.
  events.onmessage = (event) => {
    summary.value = summary.value.concat(base64Decode(event.data));
  };

  events.onerror = (event) => {
    // Event objects do not serialize usefully with JSON.stringify, so log the object itself.
    console.error('EventSource.onerror:', event);
  };

  // Named "control" events carry base64-encoded JSON commands.
  events.addEventListener("control", (event) => {
    const control = JSON.parse(atob(event.data));
    console.log(`Incoming control command: ${JSON.stringify(control, null, 2)}`);
    switch (control.command) {
      case 'done':
        events.close();
        isStreaming.value = false;
        break;
      default:
        console.log('Unknown SSE command.');
    }
  });
}


watch(() => props.streamUrl, () => {
  if (props.streamUrl) {
    startStream(props.streamUrl);
  }
}, {immediate: true});

</script>
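
The base64Decode helper imported from "./utils" isn’t shown above. A minimal sketch of what such a helper could look like follows; the implementation is an assumption, not our actual code, and base64Encode is a hypothetical counterpart for the Node side. The point it illustrates is that plain atob returns one character per byte, so non-ASCII text needs an explicit UTF-8 decode.

```javascript
// Hypothetical stand-in for the base64Decode helper imported from "./utils".
function base64Decode(encoded) {
  // atob yields a "binary string": one character per decoded byte.
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  // Interpret those bytes as UTF-8 so accents, emoji, etc. survive.
  return new TextDecoder('utf-8').decode(bytes);
}

// Hypothetical counterpart for the sending side: UTF-8 encode, then base64.
function base64Encode(text) {
  const bytes = new TextEncoder().encode(text);
  const binary = Array.from(bytes, (b) => String.fromCharCode(b)).join('');
  return btoa(binary);
}
```

Base64 also keeps newlines in the markdown from colliding with the line-based SSE framing, since each data field has to fit on one line.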

Thanks @TimJohns for the detailed response! :>

Thanks a ton for this @TimJohns

Would this be possible in plain vanilla JavaScript? Or, if not, how can we do this using Vue or other utility libraries?

Could you please provide a full example? It would help me a lot.

Thanks

As I also needed one and didn’t find anything suitable, I tried writing one myself. If you or anyone else is brave enough to test it :wink:, you can find it on GitHub by searching for StreamMdProcessor.

Thanks for your contribution. It worked for me, with a bit of adjustment.

Keep going, it’s great code!

Might be a little late for a reply, but posting here in case anyone else runs into the issue.

This is the frontend JS code I used in my chatbot app for processing the streamed response from a Node.js Express application.

// Process the streaming response from the assistant
async function processAssistantResponse(response) {
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let done = false;

    // Create a new message div for the assistant's response
    const assistantMessageDiv = createAssistantMessageDiv();

    let buffer = '';

    while (!done) {
        const { value, done: readerDone } = await reader.read();
        done = readerDone;

        if (value) {
            // stream: true keeps multi-byte characters that are split across chunks intact
            const chunk = decoder.decode(value, { stream: true });
            buffer += chunk;

            // Process the buffer for SSE messages
            const lines = buffer.split('\n');

            // Process all complete lines except the possibly incomplete last line
            for (let i = 0; i < lines.length - 1; i++) {
                const line = lines[i].trim();
                updateAssistantMessage(assistantMessageDiv, line);
            }
            // Keep the last line in the buffer (in case it's incomplete)
            buffer = lines[lines.length - 1];
        }
    }

    // Flush whatever is left if the stream ended without a trailing newline
    buffer += decoder.decode();
    if (buffer.trim()) {
        updateAssistantMessage(assistantMessageDiv, buffer.trim());
    }
}

// Update the assistant's message div with new content
function updateAssistantMessage(messageDiv, message) {
    // Format the message content using the updated formatMessage function
    const formattedMessage = formatMessage(message);

    messageDiv.innerHTML += formattedMessage;

    // Scroll to the bottom of the chatbox
    const chatBox = document.getElementById('chat-box');
    chatBox.scrollTop = chatBox.scrollHeight;
}

// Function to convert markdown-like syntax into HTML
function formatMessage(message) {
    // Convert Markdown to HTML
    const rawHTML = marked.parse(message);
    // Sanitize the HTML
    const sanitizedHTML = DOMPurify.sanitize(rawHTML);

    return sanitizedHTML;
}

To use this, add the following to your HTML to load the libraries needed by the formatMessage() function:

    <!-- Include Marked and DOMPurify via CDN before your main script -->
    <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/dompurify/dist/purify.min.js"></script>

Hope this saves you some time.
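
One caveat with rendering each line on its own: markdown constructs that span several lines (a list, a fenced code block) get parsed as separate fragments, so every list item ends up in its own list. A possible variation, closer to the re-render-everything workaround earlier in the thread, is to accumulate the full text and replace the div’s contents on every chunk. This is only a sketch; createMessageRenderer is a hypothetical name, and the render function is passed in so the snippet runs without marked or DOMPurify loaded.

```javascript
// Hypothetical helper: accumulates streamed text and re-renders all of it on each chunk.
// `target` is any object with an innerHTML property, such as the assistant message div.
// `render` turns Markdown into safe HTML, for example:
//   (text) => DOMPurify.sanitize(marked.parse(text))
function createMessageRenderer(target, render) {
  let fullText = '';

  return function append(chunk) {
    fullText += chunk;
    // Replace (not append) so multi-line constructs are parsed as a whole.
    target.innerHTML = render(fullText);
    return fullText;
  };
}

// Possible usage inside the read loop (assumes the names from the code above):
//   const append = createMessageRenderer(assistantMessageDiv, formatMessage);
//   append(decoder.decode(value, { stream: true }));
```

The trade-off is the same one described earlier in the thread: simple code, at the cost of re-parsing the whole message on every chunk.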

You need an incremental parser like Lezer or Tree-sitter, the kind of thing used for constructing ASTs in text / code editors (streaming in text is nearly the same as a user typing at the end of a doc).
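
To make the idea concrete without pulling in either library, here is a much cruder sketch of the same principle: text that can no longer change is parsed once, and only the tail that is still being "typed" gets re-parsed. This is not Lezer or Tree-sitter and not a real incremental parser; createBlockStream is a hypothetical name, and blocks are simply split on blank lines outside code fences, so things like loose lists (items separated by blank lines) would be split incorrectly.

```javascript
// Crude illustration only: completed blocks are frozen, the trailing block stays open.
const CODE_FENCE = '`'.repeat(3); // three backticks

function createBlockStream() {
  const closed = []; // blocks that are complete and never need re-parsing
  let pending = '';  // the block that is still receiving text

  function push(chunk) {
    pending += chunk;
    let searchFrom = 0;
    let idx;
    // A blank line ends a block, unless it sits inside an open code fence.
    while ((idx = pending.indexOf('\n\n', searchFrom)) !== -1) {
      const candidate = pending.slice(0, idx);
      const fences = candidate.split('\n').filter((l) => l.startsWith(CODE_FENCE)).length;
      if (fences % 2 === 1) {
        searchFrom = idx + 1; // blank line inside a code block, keep looking
        continue;
      }
      if (candidate.trim() !== '') closed.push(candidate);
      pending = pending.slice(idx + 2);
      searchFrom = 0;
    }
    return { closed: closed.slice(), pending };
  }

  return { push };
}
```

With something like this, the HTML for each closed block could be rendered and cached once, and only pending would go through the markdown renderer on each chunk.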
