What is the recommended way to add context to the assistant?

As an experiment, I’m creating a Chat-Box to possibly help me.

To be useful, this assistant must first read the report on the page.
So I wonder how to include this information.

Among the options, we have:

  • Use a ‘get_issue’ tool?
  • Include the extra information as a file?
  • Add a hidden message?
  • Edit the initial prompt to include the report?

In the first attempts, I let the assistant use a ‘get_issue’ tool.
But I later noticed that, after the tool’s response, the assistant evaluates the entire System Prompt again, which leads to longer waiting times and costs.

So I am currently editing the System Prompt including the report when creating the Run object.

But I’m still not sure which is more appropriate.

System Prompt You are an Assistant in a chat box on the official {removed} Bug Tracker. You are chatting with {removed} about the issue that was reported by someone else. THIS {removed} IS NOT THE PERSON WHO REPORTED THE ISSUE.

Your primary objectives are to:

  1. provide reply suggestions when solicited or
  2. provide information.

General instructions:

  • Keep communication brief and focused.
  • When checking a GPU, be clear about why it is supported or not.

Instructions for the suggested reply when suggesting a replies:

  • Suggest reply to last message in the issue (NOT THE ONE IN THE CHATBOX).
  • Do not use expressions like: Dear, Best regards.
  • For glitches, rendering issues or when {removed} doesn’t open, use the ‘check_gpu’ tool to determine if the GPU is supported.
  • In the suggested reply:
    a. Impersonate the {removed}.
    b. Thank the author for the report.
    c. Provide solutions if possible.
    d. If the nature of the report is unclear (whether it’s a bug, feature request, or assistance request), request additional information for clarification.
    e. If confirmed a bug, mention that the report will be forwarded to the developers for further inspection.
    f. If the report is a Feature Requests, redirect the user to the appropriate channels for user feedback and feature requests: {removed}
    g. If the report is a request for information or help, redirect the user to seek general assistance on the {removed} community sites: {removed}

Hi, you have a lot of ideas there, some are not really practical. Some of the more natural techniques are not available in assistant methods.

I believe you are on the best path of those that are offered - instructions (which is equivalent to, but not exactly the same thing as, “system message”.)

  • You need the AI to always know certain information;

  • You cannot rely on document retrieval, as it is based on user input and the report may be displaced by other files when loading the AI context;

  • You cannot place messages in any other role than “user”, and user messages may expire from chat history at unknown point in time;

  • adding continued information injection via user messages will place them in threads, blowing up the size quickly.

Therefore, the one persistent place without duplication is in instructions.

You can decide if per-assistant is what you want, or if it should be the instruction that overrides assistant programming per-run.

Perhaps OpenAI would see the light of a “documentation” message role. One specifically for automated knowledge injection, recognized via training as being useful for answering the latest question. One an assistant doesn’t need beyond a run. This works well in my own -instruct chatbot employing emulation of ChatML.


Glad to see that editing the instructions is the currently offered path that most qualifies the aforementioned objective: having persistent information in the thread.

In fact, having other role options such as “documentation”, seems to be a good addition, however I think this could fall into the same limitation of information included through tools, which is the waiting time to process the existing context.

Another option I thought of would be to have a lightweight assistant (or instruction) that is a manager that chooses another assistant (or instruction) depending on the first message.

The AI is actually very fast at processing initial input. You can feel free to give the AI 1k or 8k of documentation context as your budget allows.

If you stream via chat completions, the time to get the first token back has very little penalty based on how much input context has been placed. The efficiency of attention layers prevent computation from going sky-high on long context length.

Do you have a write up of how you configured this?

Somewhat novel and somewhat obvious:

I explored the idea pretty far, but not as far as a seamless drop-in to replace chatcompletion code, and new python library would make a rewrite necessary.

I might explore again what the endpoint will encode as token strings, as another leaked model gave full special token access for the day it was up.

“Projects and applications” posts on this forum are ignored, so I kind of gave up the idea of offering utilities and use-cases here.

1 Like

Im just offereing my two cents on how i did it. Currently i have a table with 3 sections. Name, userID, threadID. I have a function that can check this table and it will search by userId to try and find an existing thread and if it doesn’t exist then it will create a new thread and tie that thread id to the user id as well as store in the table to be access to maintain context in a thread. And the thread is key to maintaining context. Here is sample code

` def get_or_create_thread_id(user_id, sender_name):

existing_user = app_tables.thread.get(UserID=user_id)
if existing_user:
    return existing_user['ThreadID']
    thread = client.beta.threads.create()
    thread_id = thread.id
    app_tables.thread.add_row(Name=sender_name, UserID=user_id, ThreadID=thread_id)
    return thread_id`