Assistants API Retrieval Pricing: how much does this cost?

Thank you. I wasn’t allowed to write more replies yesterday by the system. I’ll try this. Nothing suggests there’s anything hacky about it or unintended use. It’s just using the file uploads well within the 100GB limit per organization and then passing one of those files to each session.


Hi Fyodor,
Did you have any updates on this?
I also want to build an assistant for each user of my 10K DAU application.

The way I understood this: You can attach 20 files to one assistant.
OR you can attach files to a thread. If you create one thread per user, then each thread will be separated from the others. Thread A cannot access the files you added to thread B.

The thread has initially no connection to the assistant. Only if you RUN an assistant on a thread, there will be activity. So, you can run different assistants on one thread. Or 40k threads on one assistant.

Anyway, your problem of separating files into different sessions, so that a second user cannot access them, should be covered by that, if I understood it correctly.
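To make the "one assistant, one thread per user" idea concrete, here is a minimal sketch. It assumes an OpenAI Node SDK (v4) client is passed in as `client` and uses the beta `threads.create` and `threads.runs.create` calls; the `threadIdsByUser` map and both function names are hypothetical, and a real application would keep the mapping in its own database.

```javascript
// Hypothetical in-memory store: user ID -> thread ID.
const threadIdsByUser = new Map();

// Return the user's existing thread ID, or create a new private thread.
async function getOrCreateThread(client, userId) {
  if (threadIdsByUser.has(userId)) {
    return threadIdsByUser.get(userId);
  }
  // Tag the thread with the user so it can be identified later.
  const thread = await client.beta.threads.create({ metadata: { user: userId } });
  threadIdsByUser.set(userId, thread.id);
  return thread.id;
}

// Run the single shared assistant on the thread that belongs to this user.
async function runAssistantForUser(client, assistantId, userId) {
  const threadId = await getOrCreateThread(client, userId);
  // The assistant only touches this thread once a run is created on it.
  return client.beta.threads.runs.create(threadId, { assistant_id: assistantId });
}
```

Because files attached to messages live on the thread, each user's run only sees what was added to that user's own thread.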

That’s not for adding files. It’s just for referencing which already-uploaded files should be used for the message.

Yes, it’s for adding files. Yes, the upload itself is another request that you have to make beforehand. But it’s not impossible: just do it before the message call. File IDs are the way to handle files. I can’t see why this should be a problem for your use case. If someone adds a file, make two calls: one to upload it, and one to add the File ID to the message.
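The two calls described above might look roughly like this. It is a sketch, not a definitive implementation: it assumes an OpenAI Node SDK (v4) client passed in as `client` and the `file_ids` message field from the Assistants beta as it existed when this thread was written (later API versions changed how files are attached). The function name `sendMessageWithFile` is made up for illustration.

```javascript
// Upload a file, then reference its File ID in a message on the user's thread.
async function sendMessageWithFile(client, threadId, fileStream, text) {
  // Call 1: upload the file so that it gets a File ID.
  const file = await client.files.create({ file: fileStream, purpose: "assistants" });

  // Call 2: add the message to the thread, referencing the uploaded file.
  const message = await client.beta.threads.messages.create(threadId, {
    role: "user",
    content: text,
    file_ids: [file.id],
  });

  return { fileId: file.id, messageId: message.id };
}
```

In Node, `fileStream` would typically be something like `fs.createReadStream("notes.txt")`.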

Could you explain this more? The metadata piece? The business logic you speak of? Matching user to assistant?

Your threads (which are attached to a single user / conversation) can contain metadata.

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
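Since those limits are easy to exceed by accident, a small check before sending metadata can help. The helper below is hypothetical (it is not part of the SDK) and simply encodes the limits quoted above; it uses JavaScript string length as an approximation of "characters".

```javascript
// Limits taken from the documentation text quoted above.
const MAX_PAIRS = 16;
const MAX_KEY_LENGTH = 64;
const MAX_VALUE_LENGTH = 512;

// Return a list of human-readable problems; an empty list means the metadata fits.
function validateMetadata(metadata) {
  const problems = [];
  const entries = Object.entries(metadata);
  if (entries.length > MAX_PAIRS) {
    problems.push(`too many pairs: ${entries.length} (max ${MAX_PAIRS})`);
  }
  for (const [key, value] of entries) {
    if (key.length > MAX_KEY_LENGTH) {
      problems.push(`key too long: "${key.slice(0, 20)}..."`);
    }
    if (String(value).length > MAX_VALUE_LENGTH) {
      problems.push(`value too long for key "${key}"`);
    }
  }
  return problems;
}
```

For example, `validateMetadata({ modified: "true", user: "abc123" })` returns an empty array.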

If you want to attach specific/private documents to specific people then making an assistant for each person is not the solution.

Even if it was, the procedure is essentially the same. You need to get/authenticate the user, and then find the appropriate assistant/file to attach to them (business logic).

Once you have performed this task, you can set the metadata to identify the user for future reference:

async function main() {
  // threadId is a placeholder for the ID of the user's existing thread
  const updatedThread = await openai.beta.threads.update(threadId, {
    metadata: { modified: "true", user: "abc123" },
  });
}


No official answer yet. If you figure it out please let me know.

take a look at this and let me know if it helps (I am trying to do a similar thing to you, but with 60):

First of all, your calculation seems wrong.
You need 40k unique assistants. Suppose each assistant has one unique file; then you have to pay $0.20 per day per assistant, so the total (40,000 × 0.20) comes to $8,000 per day for all 40k assistants. Multiplied by 30 days, the $0.20 daily cost becomes $6.
So basically OpenAI will charge $6 per assistant per month, and other costs such as threads and messages will be added on top of this.
At the moment it is far too costly, whereas with embeddings you can build solutions at a lower cost.

Fyodor’s question relates to the pricing structure for the Assistants API, specifically the retrieval part, for a scenario involving 40,000 assistants, each with a single text file of approximately 100kb.

Based on OpenAI’s pricing structure of $0.20 per GB per assistant per day, the calculation for the total daily cost of the retrieval service for 40,000 assistants would be as follows:

  1. First, calculate the total data size for all assistants. Since each assistant has a 100kb file, for 40,000 assistants the total data size is 40,000 × 100kb = 40,000 × 0.0001GB = 4GB (since 1GB = 1,000,000kb).
  2. Multiply this total data size by the cost per GB per assistant per day ($0.20).

Let’s do the calculation:

The total daily cost for the retrieval part of the Assistants API for 40,000 assistants, each with a single 100kb text file, would indeed be $0.80 per day. Fyodor’s calculation is correct. Each assistant incurs a cost of 0.20 × 0.0001 = $0.00002 per day, and for 40,000 assistants it amounts to 40,000 × 0.20 × 0.0001 = $0.80 per day.
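The same arithmetic can be checked with a few lines of code. This is only a sketch of the per-GB reading of the pricing discussed in this thread; the $0.20 figure and the decimal unit conversion are taken from the posts above, and the function name is made up for illustration.

```javascript
// Figures quoted in the thread, not fetched from any official price list.
const PRICE_PER_GB_PER_ASSISTANT_PER_DAY = 0.2; // USD
const KB_PER_GB = 1000000; // decimal units, as used in the post above

// Daily retrieval storage cost when every assistant holds one file of the given size.
function dailyRetrievalCost(assistantCount, fileSizeKb) {
  const totalGb = (assistantCount * fileSizeKb) / KB_PER_GB;
  return totalGb * PRICE_PER_GB_PER_ASSISTANT_PER_DAY;
}

const perDay = dailyRetrievalCost(40000, 100);
console.log(`per day: $${perDay.toFixed(2)}`);
console.log(`per 30 days: $${(perDay * 30).toFixed(2)}`);
```

Under this reading, the cost scales with the total amount of stored data rather than with the number of assistants alone, which is where the two estimates in this thread diverge.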

Assistant Retrieval API with or without threading, and its effect on cost?
I am in a situation involving the Retrieval method for files and its pricing. If I use a new assistant every time, which means always creating a new thread, would that cost me more, or would reusing the same thread, which keeps the saved tokens (which I really don’t need), cost me more?
For me, the Retrieval method with an assistant works much better than Pinecone. I have one big file with unique numbers and data from which the AI needs to pick, depending on the situation.

Out of curiosity, is this data indexed and searchable? Is it structured or unstructured?

Just to be on the same page: you would use the same assistant, but create a new thread. Yes, creating a new thread starts a new conversation and resets the context to 0.