Assistants API pricing details per message

According to the pricing page, Retrieval for the Assistants API should be free until 11/17. Still, we see very high costs that are probably generated by the large Retrieval files used.
Does anyone know whether this “free until…” is actually being applied or not?

| Tool | Input |
| --- | --- |
| Code interpreter | $0.03 / session (free until 11/17/2023) |
| Retrieval | $0.20 / GB / assistant / day (free until 11/17/2023) |
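
For scale, that retrieval line multiplies out per GB stored and per attached assistant. A quick illustrative calculation (a sketch based only on the quoted rate, not an official calculator):

```python
# Rough daily retrieval-storage cost per the quoted rate above.
# Illustrative only; not an official cost calculator.
RATE_PER_GB_PER_ASSISTANT_PER_DAY = 0.20

def daily_retrieval_cost(total_gb: float, num_assistants: int) -> float:
    """Storage fee = GB stored x assistants the files are attached to x daily rate."""
    return total_gb * num_assistants * RATE_PER_GB_PER_ASSISTANT_PER_DAY

# Example: 5 GB of files attached to 2 assistants
print(daily_retrieval_cost(5, 2))  # 2.0 -> $2.00/day, roughly $60/month if left attached
```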

The high cost you are seeing is for API processing of the tokens from the retrieved files. If you look at the logs, probably hundreds of thousands of tokens have accumulated. My understanding, which may be incorrect, is that the $0.20/GB/assistant/day applies when they pull the file from their storage before processing.

So, is that $0.20/GB/assistant/day just to store your files, or is it charged each time you use one or more assistants to retrieve data from your files, as in when you ask questions related to the files?

It is listed under Retrieval, so I think it applies only when you retrieve the file. But I am not sure, so maybe someone can clarify this.

1 Like

I’m also interested in whether it’s charged if unused. I don’t even see a usage report for it in the dashboard. I know it’s free for now, but I figured there’d be a spot for it; there’s one for Code Interpreter sessions.

My bet is that it’s only charged if the assistant is used :moneybag:

Guess we’ll have to wait for the 18th :man_shrugging:

The retrieval fee is per GB just sitting there, multiplied by the number of assistants the files are connected to.

The cost, even while it is free, comes from the black-box backend filling the GPT-4 context with 100k tokens of your document, to the tune of $1 per question if the model only answers the question and doesn’t instead call Code Interpreter or a function for you.
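
For scale, that $1 figure works out directly from the input-token rate. A sketch assuming gpt-4-turbo input at roughly $0.01 per 1K tokens, which matched the pricing page at the time:

```python
# Rough input-side cost when retrieval stuffs the context window.
# Assumed rate: ~$0.01 per 1K input tokens for gpt-4-1106-preview.
INPUT_PRICE_PER_1K = 0.01

context_tokens = 100_000  # retrieval filling most of the 128K window
print(f"${context_tokens / 1000 * INPUT_PRICE_PER_1K:.2f} per question")  # -> $1.00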

So you are saying that if you have 100 GB of files uploaded and one assistant, the cost would be $20 per day with no activity whatsoever?

That’s the weird thing. We have 100 GB of free file storage, but an assistant is limited to 10(?) × 512 MB files, and we’re still charged for storage.

The document is embedded into a vector database and the questions asked are also embedded by AI.

Honest pricing would be something like $10 per gig when you upload it (within a few orders of magnitude).
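
For what it’s worth, a back-of-envelope estimate under stated assumptions (roughly 4 bytes per token of plain text, and embeddings at about $0.0001 per 1K tokens, the then-current rate) lands in the same ballpark:

```python
# One-time back-of-envelope embedding cost for 1 GB of plain text.
# Assumptions (not official figures): ~4 bytes per token, embeddings at ~$0.0001 per 1K tokens.
bytes_per_gb = 1_000_000_000
tokens = bytes_per_gb / 4           # roughly 250 million tokens
cost = tokens / 1000 * 0.0001       # roughly $25 to embed the whole GB once
print(f"~{tokens / 1e6:.0f}M tokens, ~${cost:.0f} one-time")
```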

1 Like

Good point. So then I’m only left wondering whether a message-attached text file is embedded at all, or whether it’s embedded and then discarded?

I don’t know what a message-attached text file is.

If it is user input, it is retained until the model runs out of context length. So if you paste 15,000 tokens into your prompt, you are now guaranteed to be paying $0.15 more on every question, and that keeps increasing up to $1.25+ (at 128k) until you start a “new chat”.

If it is an uploaded document, then the AI context also gets maxed out with it immediately.
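
For the pasted-text case above, a rough sketch of how that input cost compounds per turn (illustrative numbers only: roughly $0.01 per 1K gpt-4-turbo input tokens and about 1K new tokens of conversation per exchange):

```python
# How the input bill compounds when 15K pasted tokens ride along on every turn.
# Illustrative: ~$0.01 per 1K input tokens, ~1K tokens of new conversation per exchange.
INPUT_PRICE_PER_1K = 0.01
history_tokens = 15_000              # the pasted document
for turn in range(1, 6):
    history_tokens += 1_000          # each question/answer pair grows the resent history
    cost = history_tokens / 1000 * INPUT_PRICE_PER_1K
    print(f"turn {turn}: ~${cost:.2f} of input tokens")
# ...and it keeps climbing until the 128K window caps it at roughly $1.28 per turn.
```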

With Assistants we can attach Files to the message object that don’t necessarily need to be attached to the assistant itself. They can still be used for retrieval.

Yes, as I described just before. Either you get the whole document, or as much of the document as will fit after a semantic search on the chunks that were embedded. In a normal vector database client you would have a similarity threshold; here it doesn’t matter whether you’re talking about elephants or moon landings, the context gets maxed out.
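
For comparison, here is a minimal sketch of what a self-managed retrieval step usually looks like, with the similarity threshold doing the filtering. The threshold and chunk counts are arbitrary illustration, not a description of what the Assistants backend actually does:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between a query embedding and a chunk embedding."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, chunk_vecs, threshold=0.8, max_chunks=20):
    """Rank chunks by similarity, then drop anything below the threshold
    instead of stuffing the context with weak matches."""
    scored = sorted(((cosine(query_vec, v), i) for i, v in enumerate(chunk_vecs)), reverse=True)
    return [i for score, i in scored[:max_chunks] if score >= threshold]
```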

1 Like

If I understand you correctly, you are saying that the file gets completely expanded into the prompt as context, not embedded.

I’m sorry. I’m trying to understand the difference between uploading to an assistant vs. uploading to a message.

If you were to make a chatbot that had Code Interpreter, then you could also have an “upload” that uploads to the sandbox. You are dinged $0.03 for every conversation that has Code Interpreter enabled, a haircut right off the top before the AI even writes code to extract your PDF and return it as an in-context function result that becomes chat history, or what have you.
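
For anyone comparing the two attachment points being discussed, here is a minimal sketch assuming the beta Assistants endpoints as they shipped in the openai Python SDK v1 at the time; the file_ids parameters and the "retrieval" tool name may differ in later API versions:

```python
from openai import OpenAI

client = OpenAI()

# Upload once; the file can then be attached at the assistant level or the message level.
doc = client.files.create(file=open("report.pdf", "rb"), purpose="assistants")

# Attached to the assistant: available for retrieval on every thread
# (and counted toward the per-GB/assistant/day storage fee).
assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],
    file_ids=[doc.id],
)

# Or attached only to a single message: scoped to that thread rather than the whole assistant.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Summarize the attached report.",
    file_ids=[doc.id],
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
```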

Hello,
My understanding, looking at the pricing on the website, is that using the Assistants API is free until the 13th of December. Could someone please correct me if I am wrong?

1 Like

Thanks for the heads up. It seems they updated it; it was supposed to end last week. I deleted all my uploaded assistant files prior to that.

Only the retrieval storage and the add-on Code Interpreter are free. You still pay for the excessive tokens and iterations that those tools consume in operation.

1 Like

Thanks for the clarification. Can you buy a certain amount of tokens, or are they billed as you consume them? I am not sure how to place a limit on the tokens I use or even to estimate how many I need for a certain application.
Thanks for your help!

The pricing of an assistant is somewhat out of control at this point. A “thread” conversation can go on without limit until the conversation fills the model context. The file retrieval is also not targeted and not limited to just the best matches, which is another way they end up filling the model context with data. That can mean 100k tokens of information automatically loaded into the new gpt-4-turbo: $1+ per run step. Then, if your API call also has function-calling enabled, or has Code Interpreter enabled, the AI carries that context with it through multiple iteration steps on those tools.

The only “safe” way to use it is with gpt-3.5-turbo-0613, with its lower prices and a maximum of 4k context that you’ll be billed for per iteration.
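
To put rough numbers on that difference, and to estimate token counts before sending anything, something like the sketch below works. The per-1K rates are the then-current input prices and should be treated as assumptions:

```python
import tiktoken  # OpenAI's tokenizer library, handy for estimating token counts up front

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by the gpt-3.5-turbo / gpt-4 family

def estimate_tokens(text: str) -> int:
    return len(enc.encode(text))

# Worst-case input cost per iteration if the context window is completely filled.
# Rates below are the then-current input prices and are assumptions, not official figures.
models = {
    "gpt-3.5-turbo-0613": (4_096, 0.0015),   # 4K window, ~$0.0015 per 1K input tokens
    "gpt-4-1106-preview": (128_000, 0.01),   # 128K window, ~$0.01 per 1K input tokens
}
for name, (window, rate) in models.items():
    print(f"{name}: worst case ~${window / 1000 * rate:.2f} of input per iteration")
# Roughly $0.006 vs $1.28, which is why capping things at a 4K model bounds the damage.
```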

4 Likes