Streaming markdown text and images from assistant using code interpreter

Okay, now I’m really confused. At first I thought it was me, but I went back to the original streaming code to start over and discovered that the documentation is flawed. The key to success appears to be the event handler. The documentation shows this code sample for streaming:

from typing_extensions import override
from openai import AssistantEventHandler, OpenAI
 
client = OpenAI()
 
class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        print(f"\nassistant > ", end="", flush=True)

    @override
    def on_tool_call_created(self, tool_call):
        print(f"\nassistant > {tool_call.type}\n", flush=True)

    @override
    def on_message_done(self, message) -> None:
        # print a citation to the file searched
        message_content = message.content[0].text
        annotations = message_content.annotations
        citations = []
        for index, annotation in enumerate(annotations):
            message_content.value = message_content.value.replace(
                annotation.text, f"[{index}]"
            )
            if file_citation := getattr(annotation, "file_citation", None):
                cited_file = client.files.retrieve(file_citation.file_id)
                citations.append(f"[{index}] {cited_file.filename}")

        print(message_content.value)
        print("\n".join(citations))


# Then, we use the stream SDK helper
# with the EventHandler class to create the Run
# and stream the response.

with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="Please address the user as Jane Doe. The user has a premium account.",
    event_handler=EventHandler(),
) as stream:
    stream.until_done()

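The problem, however, is that this code doesn’t actually stream. The handler defines no on_text_delta method, so nothing is printed while the response is being generated; the whole message only appears in one piece when on_message_done fires. I know the documentation has a reputation for being bad, but I had no idea it would sink to this level.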

After much playing around I arrived at this solution for the event handler to actually stream the output:

class EventHandler(AssistantEventHandler):
    """Custom event handler for processing assistant events."""

    def __init__(self):
        super().__init__()
        self.results = []  # Initialize the results list

    @override
    def on_text_created(self, text) -> None:
        """Handle the event when a new text block starts."""
        # Print a label to the console; the text itself arrives in on_text_delta
        print("\nassistant text > ", end="", flush=True)
        # `text` is a Text object rather than a str, so it is not appended
        # to results; the deltas below carry the actual content

    @override
    def on_text_delta(self, delta, snapshot):
        """Handle the event when there is a text delta (partial text)."""
        # Print the delta value (partial text) to the console
        print(delta.value, end="", flush=True)
        # Append the delta value to the results list
        self.results.append(delta.value)

    @override
    def on_tool_call_created(self, tool_call):
        """Handle the event when a tool call is created."""
        # Print the type of the tool call to the console
        print(f"\nassistant tool > {tool_call.type}\n", flush=True)

    @override
    def on_tool_call_delta(self, delta, snapshot):
        """Handle the event when there is a delta (update) in a tool call."""
        if delta.type == 'code_interpreter':
            # Check if there is an input in the code interpreter delta
            if delta.code_interpreter.input:
                # Print the input to the console
                print(delta.code_interpreter.input, end="", flush=True)
                # Append the input to the results list
                self.results.append(delta.code_interpreter.input)
            # Check if there are outputs in the code interpreter delta
            if delta.code_interpreter.outputs:
                # Print a label for outputs to the console
                print("\n\noutput >", flush=True)
                # Iterate over each output and handle logs specifically
                for output in delta.code_interpreter.outputs:
                    if output.type == "logs":
                        # Print the logs to the console
                        print(f"\n{output.logs}", flush=True)
                        # Append the logs to the results list
                        self.results.append(output.logs)

# Using our first assistant
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()

Which gives me this result:

assistant tool > file_search

assistant text > The main characters in “The Wonderful Wizard of Oz” are:

  1. Dorothy - A young girl from Kansas who is transported to the Land of Oz by a cyclone【10:6†source】.
  2. Toto - Dorothy’s small dog who accompanies her on her journey【10:8†source】.
  3. Scarecrow - A character Dorothy meets who desires to have brains【10:5†source】.
  4. Tin Woodman - Another companion of Dorothy who wishes for a heart【10:5†source】.
  5. Cowardly Lion - A lion who joins Dorothy in hopes of gaining courage【10:5†source】.
  6. The Wizard of Oz - The ruler of the Emerald City who the characters believe can grant their wishes【10:5†source】.

Additionally, there are other notable characters:

  • Glinda, the Good Witch of the South - Who helps Dorothy and her friends【10:16†source】.
  • The Wicked Witch of the West - One of the main antagonists in the story【10:12†source】【10:13†source】.
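
Since the handler also collects every chunk in self.results, the full text can be rebuilt once the run finishes. Here is a minimal sketch of that idea. CollectingHandler and collected_text are hypothetical names, and the class is a stand-in that takes plain strings so the example runs without the openai package:

```python
class CollectingHandler:
    """Hypothetical stand-in for the EventHandler above (no openai dependency)."""

    def __init__(self):
        self.results = []  # chunks in the order they arrived

    def on_text_delta(self, delta_value):
        # The real callback receives (delta, snapshot); this stand-in
        # takes the delta string directly to keep the sketch self-contained.
        print(delta_value, end="", flush=True)
        self.results.append(delta_value)


def collected_text(handler):
    """Join the collected chunks, skipping any entry that is not a str."""
    return "".join(chunk for chunk in handler.results if isinstance(chunk, str))


handler = CollectingHandler()
for chunk in ["The main ", "characters ", "are:"]:  # simulated deltas
    handler.on_text_delta(chunk)

print()  # end the streamed line
print(collected_text(handler))
```

With the real SDK, the same thing should work if the handler is created first (handler = EventHandler()) and passed as event_handler=handler, so that a reference to it is still around after the with block ends.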

Now the adventure begins. I will keep digging to find out how to deal with the annotations properly and post back here. My theory is that this will lead to the answer to the original question since, apparently, code interpreter file information is delivered as an annotation. We will see. Stay tuned 🙂
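
As a starting point for the annotation handling, here is a rough sketch of the replacement step from the documentation sample, applied to plain data. It assumes that markers such as 【10:8†source】 in the streamed text are exactly the annotation text values. The Annotation class, the apply_citations helper, and the file name are made up for illustration and are not part of the SDK:

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical stand-in for an SDK annotation object."""
    text: str      # the marker as it appears in the message text
    filename: str  # assumed to be already resolved from the file id


def apply_citations(value, annotations):
    """Replace each marker with [index] and build a matching citation list."""
    citations = []
    for index, annotation in enumerate(annotations):
        value = value.replace(annotation.text, f"[{index}]")
        citations.append(f"[{index}] {annotation.filename}")
    return value, citations


raw = "Toto - Dorothy's small dog【10:8†source】."
annotations = [Annotation(text="【10:8†source】", filename="wizard_of_oz.txt")]

clean, citations = apply_citations(raw, annotations)
print(clean)
print("\n".join(citations))
```

In the real handler this would presumably run in on_message_done, where the complete annotation list is available, rather than on each delta.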
