Questions about Assistant, threads


New developer here trying out the Assistant API and looking to implement it in a project of mine.

I have some questions about Assistants and threads.

  1. From reading about Threads, I understand that “context” is kept within the thread. Does this mean that each subsequent call to the API consumes more and more tokens, up to a limit at which the context is truncated? Or is the context saved in some way that doesn’t include the previous prompts?

  2. If I were to implement an Assistant in my project, I’d need it to scale horizontally to multiple users. My plan would be to create one assistant, with one thread for each user. Is this possible, and what are the limitations? I couldn’t find any answers on Google.

  3. Is it possible to “clean” or “empty” threads in code, as you can in the Playground?

Thanks for any answers. Awesome technology, this.



Been working on this same issue all day. Discovered you need 3 different API calls to get the response.

1. Create a Thread: A thread represents a conversation session and is created with an initial set of messages. This is what your createThread function is doing.
2. Run the Thread: Once the thread is created, you perform a ‘run’ on this thread. The ‘run’ involves invoking the Assistant on the Thread, where it processes the messages in the thread and may append new messages (responses from the Assistant). This is handled by your runThread function.
3. Retrieve Messages from the Thread: After the run is completed and the Assistant has potentially added its responses to the Thread, you then fetch the updated list of messages from the Thread. This will include both the original messages and any new messages appended by the Assistant during the run.
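
The three calls above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes `client` is an initialized client from the official `openai` Node package (v4-style method names), and the `askAssistant` name, the `pollMs` parameter, and the simple polling loop are illustrative choices that are not part of the original posts.

```javascript
// Hypothetical helper: create a thread, run an assistant on it, return the reply text.
// `client` is assumed to be a client from the `openai` Node package (v4 method names).
async function askAssistant(client, assistantId, userText, pollMs = 1000) {
  // 1. Create a thread that already contains the user's message.
  const thread = await client.beta.threads.create({
    messages: [{ role: "user", content: userText }],
  });

  // 2. Run the assistant on the thread.
  let run = await client.beta.threads.runs.create(thread.id, {
    assistant_id: assistantId,
  });

  // Runs are asynchronous: poll until the run leaves the queued/in_progress states.
  while (run.status === "queued" || run.status === "in_progress") {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
    run = await client.beta.threads.runs.retrieve(thread.id, run.id);
  }
  if (run.status !== "completed") {
    throw new Error(`Run ended with status "${run.status}"`);
  }

  // 3. Fetch the messages; the list is returned newest first by default.
  const page = await client.beta.threads.messages.list(thread.id);
  const reply = page.data.find((m) => m.role === "assistant");
  if (!reply) return "";
  // A message's content is a list of parts; keep only the text parts.
  return reply.content
    .filter((part) => part.type === "text")
    .map((part) => part.text.value)
    .join("\n");
}
```

Passing the client in as an argument keeps the sketch independent of how the API key is loaded, and makes it easy to try out with a stub before spending tokens.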


Hello Lukas,

Welcome to our community! It’s great to see new developers diving into the Assistant API. I’ll address your questions in a more informal and clear manner:

Context Maintenance in Threads:

Horizontal Scalability with Assistants and Threads:

  • You’re on the right track: create one assistant and a separate thread for each user (OpenAI Platform). Each thread acts as an independent conversation and can serve one user (API Reference - OpenAI API). Rate limits and pricing depend on usage volume, so plan for them ahead of time.
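
A minimal sketch of the one-assistant, one-thread-per-user idea might look like this. The helper names `getThreadIdForUser` and `sendUserMessage`, and the in-memory `Map` used as a store, are assumptions for illustration; `client` is assumed to be a client from the official `openai` Node package (v4-style method names), and a real application would persist the mapping in a database.

```javascript
// Hypothetical helper: one shared assistant, one thread per user.
// `threadIds` is any Map-like store (userId -> threadId); a real app would persist it.
async function getThreadIdForUser(client, threadIds, userId) {
  let threadId = threadIds.get(userId);
  if (!threadId) {
    // First message from this user: create an empty thread and remember its ID.
    const thread = await client.beta.threads.create({});
    threadId = thread.id;
    threadIds.set(userId, threadId);
  }
  return threadId;
}

// Append a user's message to their own thread, then start a run with the shared assistant.
async function sendUserMessage(client, threadIds, assistantId, userId, text) {
  const threadId = await getThreadIdForUser(client, threadIds, userId);
  await client.beta.threads.messages.create(threadId, { role: "user", content: text });
  return client.beta.threads.runs.create(threadId, { assistant_id: assistantId });
}
```

Note that two simultaneous first requests from the same user could each create a thread in this sketch; if that matters, guard the lookup with a lock or a unique constraint in your store.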

Cleaning or Emptying Threads:

  • Based on the official documentation and API reference for OpenAI Assistants, you can modify a thread, but currently the only parameter you can change is its metadata. API Reference - OpenAI API - Modify Thread

API Reference - OpenAI API - Threads

Instead, you can manage the continuity of the conversation by creating a new thread whenever you need a fresh start, and carrying over any content that should stay relevant in future interactions.
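
As a rough illustration of the "new thread for a fresh start" approach, the sketch below swaps in a new thread and optionally deletes the old one. The `startFreshThread` name is hypothetical, and `client` is assumed to be a client from the official `openai` Node package, where the delete method is named `del` in v4.

```javascript
// Hypothetical helper: "empty" a conversation by swapping in a brand-new thread.
async function startFreshThread(client, oldThreadId) {
  // Create the replacement first so the caller always ends up with a usable thread.
  const fresh = await client.beta.threads.create({});
  if (oldThreadId) {
    // Optional cleanup of the old conversation (`del` is the v4 method name).
    await client.beta.threads.del(oldThreadId);
  }
  return fresh.id;
}
```

The caller would then store the returned ID in place of the old one for that conversation.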

sequenceDiagram
    participant User as User
    participant API as Assistants API
    participant Thread as Thread (Conversation)
    participant Message as Message

    User->>API: Create Thread (POST /v1/threads)
    API->>Thread: Thread Initialized
    Thread->>User: Respond with Thread ID

    User->>API: Send Message (POST /v1/threads/{thread_id}/messages)
    API->>Message: Process Message
    Message->>Thread: Add Message to Thread
    Thread->>User: Confirm Message Receipt

    User->>API: Request Response (GET /v1/threads/{thread_id}/messages)
    API->>Thread: Retrieve Last Message
    Thread->>API: Provide Last Message
    API->>User: Display Assistant's Response

I hope this helps shed some light on your questions!



Thanks a lot, this diagram and these resources help a lot.

I am facing one issue, however: from reading the documentation, it isn’t clear to me how to create a thread with a specific assistant.


You can’t create a thread with a specific assistant, but you can create a thread run with a specific assistant. Basically, you tell an assistant to answer based on a given thread, something like this:

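// Assumes `openai` is an initialized client from the official `openai` Node package.
export const createThreadRun = async (threadId, assistantId) => {
  console.log("Creating thread run...");
  // Start a run of the given assistant on an existing thread.
  return await openai.beta.threads.runs.create(threadId, {
    assistant_id: assistantId,
  });
};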

I created an open-source repo where I was able to get threads to work. It uses Chatbot-UI and doesn’t have the vision or file upload features, but the threads work. All you need to do is put your Assistant_ID in the env file along with your API key. The three API calls are in the server-side index.js file if you want to take a look.

Good luck my friends!


This is totally unusable without having the ability to remove messages from threads. By the time you guys feel that the thread has reached the context size limit (128k :exploding_head:), we will be broke :slight_smile:


I love the diagram, it helps to clarify things, thank you. I’m wondering, where do we use runs in all of that?

How do I know whether the uploaded file is saved in thread_id or not?

The uploaded file is saved to the assistant, not the thread. I am not sure if you can check the assistant to see if the file is attached there. You can always update the assistant.

Thinking about setting up an AMA with the Assistants team on the discord, seems like that might be a useful thing to do.


Two questions: Is it possible to include image files when creating an assistant (I know you can’t through messages yet), and if so, which file types are supported? And how do you set a max token limit for an assistant’s responses?

The whole assistants thing has the smell of being outsourced. The multi-faceted obliviousness in creating the thing and nothing being addressed about the major issues in weeks.


Did I miss something, or is there no API to get a list of Threads?
How do I find out an assistant’s usage?
I wonder what happens if somebody steals the API key for an assistant and uses tokens from the account.
There is no way to control expenses in that case.
Or in a corporate environment: there is no control over how employees use their access to assistants.


You’re right - there is no API to do this! I keep track of the ids of created Threads so I can make sure that they are properly deleted later on!

This would be great - it’s got the potential to be a game-changer!


You can turn on thread visibility in the organization settings and see all threads.


Hi, during code experiments I created a huge number of threads. Do they incur costs? How can I delete threads?

They won’t incur costs unless you are running them. You need the thread ID to delete them using the delete function. Note that clearing the Run in the Playground does not delete them. Once you delete them with the delete function, however, they will no longer be listed on the threads page.

By the way, threads are retained for up to 60 days, after which they are automatically deleted, although OpenAI has noted that this policy is still being evaluated.
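
If you recorded the thread IDs when you created them, a cleanup pass could be sketched like this. The `deleteThreads` helper is hypothetical; it assumes `client` is a client from the official `openai` Node package (v4, where the delete method is named `del`) and relies on the `deleted` flag that the API returns in its deletion response.

```javascript
// Hypothetical cleanup helper: delete every thread whose ID you recorded at creation time.
async function deleteThreads(client, threadIds) {
  const deleted = [];
  const failed = [];
  for (const id of threadIds) {
    try {
      const result = await client.beta.threads.del(id);
      // The deletion response carries a `deleted` boolean.
      (result.deleted ? deleted : failed).push(id);
    } catch (err) {
      // For example, the thread was already deleted or the ID is unknown.
      failed.push(id);
    }
  }
  return { deleted, failed };
}
```

Deleting one thread at a time keeps the sketch simple and gentle on rate limits; the returned `failed` list tells you which IDs to retry or drop from your records.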

It is a pity that OpenAI does not offer a programmatic way of listing threads. I had to devise a whole new way of using assistants to list threads of specific types. The demo below is from within the betaassi framework, which I need for streaming with the Assistant API.

SPOILER ALERT: This approach is fairly limiting. YMMV. But someone might find it helpful. Also, the streaming is with the Assistant API, but under the hood it makes use of chat completion. This streaming is NOT part of the framework; only the core components are.

Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from openai_session_handler.models.threads.multithread_basethread import MultiThreadBaseThread
>>> MultiThreadBaseThread.list()
>>> MultiThreadBaseThread.create()
MultiThreadBaseThread(id='thread_Ayzpqw7gaGCWaDrN83YPspUm', object='thread', created_at=1708489530, metadata={'pub_thread': 'thread_qq96g4sDYSV71L34PSL483bq', 'thread_type': 'MultiThreadBaseThread'}, thread_type='MultiThreadBaseThread', pub_thread='thread_qq96g4sDYSV71L34PSL483bq')
>>> MultiThreadBaseThread.list()
>>> MultiThreadBaseThread.create()
MultiThreadBaseThread(id='thread_AMvRbkpR9L1LqFuv0YGwwgiT', object='thread', created_at=1708489570, metadata={'pub_thread': 'thread_6TD04iWFtIYpc8H4bK1jYbkb', 'thread_type': 'MultiThreadBaseThread'}, thread_type='MultiThreadBaseThread', pub_thread='thread_6TD04iWFtIYpc8H4bK1jYbkb')
>>> MultiThreadBaseThread.list()
[Thread(thread_id='thread_Ayzpqw7gaGCWaDrN83YPspUm'), Thread(thread_id='thread_AMvRbkpR9L1LqFuv0YGwwgiT')]
>>> MultiThreadBaseThread.delete('thread_Ayzpqw7gaGCWaDrN83YPspUm')
ThreadDeleted(id='thread_Ayzpqw7gaGCWaDrN83YPspUm', deleted=True, object='thread.deleted')
>>> MultiThreadBaseThread.list()
>>> MultiThreadBaseThread.delete('thread_AMvRbkpR9L1LqFuv0YGwwgiT')
ThreadDeleted(id='thread_AMvRbkpR9L1LqFuv0YGwwgiT', deleted=True, object='thread.deleted')
>>> MultiThreadBaseThread.list()

For those interested, here is the GitHub link (GitHub - icdev2dev/betaassi: "betaassi: An innovative Python package for seamless integration and management of OpenAI sessions, designed to enhance developer experience with asynchronous server support and extendable classes").