Questions about Assistant, threads


New developer here trying out the Assistant API and looking to implement it in a project of mine.

I have some questions about Assistants and threads.

  1. Reading about Threads, I understand that “context” in this case is kept within the thread. Does this also mean that each subsequent call to the API consumes more and more tokens, up to a certain limit where the context is truncated? Or does it save the context in some way that doesn’t include the previous prompts?

  2. If I were to implement Assistant in my project, I’d need it to scale horizontally to multiple users. My thinking is to create one assistant, with one thread for each user. Is this possible, and what are the limitations? I couldn’t find any answers on Google.

  3. Is it possible to “clean” or “empty” threads in code such as in the playground?

Thanks for any answers. Awesome technology, this.



Been working on this same issue all day. Discovered you need 3 different API calls to get the response.

  1. Create a Thread: A thread represents a conversation session and is created with an initial set of messages. This is what your createThread function is doing.
  2. Run the Thread: Once the thread is created, you perform a ‘run’ on this thread. The ‘run’ involves invoking the Assistant on the Thread, where it processes the messages in the thread and may append new messages (responses from the Assistant). This is handled by your runThread function.
  3. Retrieve Messages from the Thread: After the run is completed and the Assistant has potentially added its responses to the Thread, you then fetch the updated list of messages from the Thread. This will include both the original messages and any new messages appended by the Assistant during the run.
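
The three calls described above can be sketched as one helper. This is a minimal sketch, assuming the official `openai` Node package with v4-style method signatures and an already-initialized client passed in as `openai`. The helper name `askAssistant`, the `pollMs` parameter, and the polling loop are illustrative and not taken from the posts in this thread.

```javascript
// Sketch only: `askAssistant` and `pollMs` are hypothetical names.
// `openai` is assumed to be a client from the official `openai` package
// (v4-style signatures), e.g. `new OpenAI()`.
async function askAssistant(openai, assistantId, userText, pollMs = 1000) {
  // 1. Create a thread with an initial user message.
  const thread = await openai.beta.threads.create({
    messages: [{ role: "user", content: userText }],
  });

  // 2. Run the assistant on the thread.
  let run = await openai.beta.threads.runs.create(thread.id, {
    assistant_id: assistantId,
  });

  // Poll until the run is no longer queued or in progress.
  while (run.status === "queued" || run.status === "in_progress") {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
    run = await openai.beta.threads.runs.retrieve(thread.id, run.id);
  }
  if (run.status !== "completed") {
    throw new Error(`Run ended with status "${run.status}"`);
  }

  // 3. Retrieve the messages (assumed newest-first) and pick the first
  //    assistant message, joining its text parts.
  const messages = await openai.beta.threads.messages.list(thread.id);
  const reply = messages.data.find((m) => m.role === "assistant");
  const text = reply
    ? reply.content
        .filter((part) => part.type === "text")
        .map((part) => part.text.value)
        .join("\n")
    : "";
  return { threadId: thread.id, text };
}
```

Passing the client in as an argument keeps the helper easy to try against a stub before spending tokens on real calls. A run can also end in states such as `requires_action` or `failed`, which this sketch simply reports as an error.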


Hello Lukas,

Welcome to our community! It’s great to see new developers diving into the Assistant API. I’ll address your questions in a more informal and clear manner:

Context Maintenance in Threads:

Horizontal Scalability with Assistants and Threads:

  • You’re on the right track by creating a unique assistant and thread for each user (OpenAI Platform). Each thread acts as an independent conversation and can handle a unique user (API Reference - OpenAI API). There are rate limits and potential pricing considerations based on usage volume, but planning ahead can help manage them effectively.
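
One way to picture the thread-per-user setup is a small lookup that maps each user to a thread ID and creates the thread on first use. This is a minimal sketch, assuming an initialized `openai` client (v4-style signatures) is passed in. `UserThreads` and its in-memory `Map` are illustrative only; an app running on several servers would likely keep this mapping in a shared database instead.

```javascript
// Sketch only: `UserThreads` is a hypothetical helper, not part of the OpenAI SDK.
class UserThreads {
  constructor(openai) {
    this.openai = openai;     // assumed: client from the official `openai` package
    this.pending = new Map(); // userId -> Promise resolving to a thread ID
  }

  // Return the user's thread ID, creating the thread on first use.
  // Caching the promise keeps concurrent calls for one user from
  // creating two threads.
  getThreadId(userId) {
    if (!this.pending.has(userId)) {
      const created = this.openai.beta.threads
        .create({ metadata: { userId: String(userId) } })
        .then((thread) => thread.id)
        .catch((err) => {
          this.pending.delete(userId); // allow a retry after a failed create
          throw err;
        });
      this.pending.set(userId, created);
    }
    return this.pending.get(userId);
  }
}
```

The same assistant ID can then be used for every run, with only the thread ID changing per user.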

Cleaning or Emptying Threads:

  • Based on the official documentation and API reference for OpenAI Assistants, you can modify a thread, but currently the only parameter available for modification is the metadata. API Reference - OpenAI API - Modify Thread

API Reference - OpenAI API - Threads

Instead of emptying a thread, you can create a new thread whenever you need a fresh start, and carry any relevant content over into it.
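
Swapping in a new thread for a fresh start could look like the following. This is a minimal sketch, assuming v4-style client methods (`beta.threads.create` and `beta.threads.del`) on an initialized `openai` client that is passed in. `startFreshThread` is a hypothetical helper name, and deleting the old thread is optional.

```javascript
// Sketch only: `startFreshThread` is a hypothetical helper name.
// `openai` is assumed to be a client from the official `openai` package (v4-style).
async function startFreshThread(openai, oldThreadId) {
  // Create the replacement first so the caller is never left without a thread.
  const fresh = await openai.beta.threads.create();

  if (oldThreadId) {
    try {
      await openai.beta.threads.del(oldThreadId);
    } catch (err) {
      // A failed cleanup should not block the new conversation.
      console.warn(`Could not delete thread ${oldThreadId}: ${err.message}`);
    }
  }
  return fresh.id;
}
```

The caller then stores the returned ID in place of the old one, so the next message starts with an empty context.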

    sequenceDiagram
    participant User as User
    participant API as Assistants API
    participant Thread as Thread (Conversation)
    participant Message as Message

    User->>API: Create Thread (POST /v1/threads)
    API->>Thread: Thread Initialized
    Thread->>User: Respond with Thread ID

    User->>API: Send Message (POST /v1/threads/{thread_id}/messages)
    API->>Message: Process Message
    Message->>Thread: Add Message to Thread
    Thread->>User: Confirm Message Receipt

    User->>API: Request Response (GET /v1/threads/{thread_id}/messages)
    API->>Thread: Retrieve Last Message
    Thread->>API: Provide Last Message
    API->>User: Display Assistant's Response

I hope this helps shed some light on your questions!



Thank you lots, this diagram and resources help a lot.

I am facing one issue however, and that’s that from reading the documentation, it isn’t clear to me how I create a thread with a specific assistant.


You can’t create a thread with a specific assistant, but you can create a run on a thread with a specific assistant. Basically, you tell the assistant to answer based on the provided thread, something like this:

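// Start a run: the given assistant responds to the messages already in the thread.
// Assumes `openai` is an initialized client from the official `openai` package.
export const createThreadRun = async (threadId, assistantId) => {
  console.log("Creating thread run...");
  return await openai.beta.threads.runs.create(threadId, {
    assistant_id: assistantId,
  });
};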

I created an open source repo where I was able to get threads to work. It uses Chatbot-UI and doesn’t have the vision or file upload features, but threads work. All you need to do is put your Assistant_ID in the env file along with your API key. The three API calls are in the server-side index.js file if you want to take a look.

Good luck my friends!


This is totally unusable without having the ability to remove messages from threads. By the time you guys feel that the thread has reached the context size limit (128k :exploding_head:), we will be broke :slight_smile:


I love the diagram, it helps to clarify, thank you. I’m wondering, where do we use runs in all of that?

How do I know whether the uploaded file is saved in thread_id or not?

The uploaded file is saved to the assistant, not the thread. I am not sure if you can check the assistant to see if the file is attached there. You can always update the assistant.

Thinking about setting up an AMA with the Assistants team on the discord, seems like that might be a useful thing to do.

Two questions: Is it possible to include image files when creating an assistant (I know you can’t through messages yet), and if so, what file type would it be? And how do you set a max token limit for an assistant’s responses?

The whole assistants thing has the smell of being outsourced. The multi-faceted obliviousness in creating the thing and nothing being addressed about the major issues in weeks.