Question about Assistant Pricing

  1. How does the amount of retrieval file storage (in MB) affect pricing?

  2. How are input and output tokens counted and billed?


The general consensus is that Assistants pricing tends to be very high and largely unpredictable.

Retrieval seems to greedily fill the context window, so input tokens are often at or near the maximum for the model you are using.

Because the assistant can call tools, and may decide it needs to do so multiple times, costs can quickly spiral.
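To see how that spiral plays out, here is a rough back-of-the-envelope sketch. The prices are assumptions (GPT-4 Turbo's launch rates of $0.01/1K input and $0.03/1K output), and the worst-case model is that every tool-call round trip re-submits the full context:

```python
# Assumed prices, not official billing logic -- purely illustrative.
GPT4_TURBO_INPUT_PER_1K = 0.01   # assumed $ per 1K input tokens
GPT4_TURBO_OUTPUT_PER_1K = 0.03  # assumed $ per 1K output tokens


def run_cost(context_tokens: int, output_tokens: int, round_trips: int) -> float:
    """Worst-case cost if every round trip re-sends the whole context."""
    input_cost = round_trips * context_tokens / 1000 * GPT4_TURBO_INPUT_PER_1K
    output_cost = round_trips * output_tokens / 1000 * GPT4_TURBO_OUTPUT_PER_1K
    return input_cost + output_cost


# One run where retrieval has packed the context to 100K tokens and the
# model makes three tool-call round trips:
print(f"${run_cost(100_000, 500, 3):.2f}")
```

Three round trips over a 100K-token context already lands around $3 for a single user interaction, which is where the "unpredictable" reputation comes from.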

@_j has much more knowledge about this than I do, hopefully he can chime in and give you some more practical tips.

But I think his general advice would be not to use the Assistants API unless at least two of these three apply to you:

  1. You know you absolutely need to in order to achieve a very specific task
  2. You have an obscene budget and don’t care about cost
  3. Burning money is your favourite hobby

That said, I think there are some ways to rein in spending on assistants, I’m just not terribly well-versed with respect to using that endpoint myself.


$0.20 / GB / assistant / day (free until 02/01/2024)

(Some people believe it could potentially be unknowable) * ($0.01) + (who knows) * ($0.03) + ($0.03) * (maybe)


It should be possible to use the Assistants API for the tools and callbacks but manage the context entirely on your own. I am still thinking through the pros and cons of this approach versus the Chat Completions API, but I'm putting it out there to see if others have gone down this path.
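A minimal sketch of what "manage the context on your own" could look like: keep the system prompt, then drop the oldest turns until the history fits a token budget you choose. Token counts here are crudely approximated as `len(text) // 4`; a real implementation would use a tokenizer such as tiktoken:

```python
# Sketch only: trim conversation history to a token budget instead of
# letting retrieval fill the context window for you.

def approx_tokens(text: str) -> int:
    # Very rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """messages[0] is assumed to be the system prompt and is always kept."""
    system, turns = messages[0], messages[1:]
    kept: list[dict] = []
    used = approx_tokens(system["content"])
    # Walk newest-to-oldest so the most recent turns survive trimming.
    for msg in reversed(turns):
        cost = approx_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system] + list(reversed(kept))
```

The trimmed list is what you would pass as `messages` to a Chat Completions call, giving you a hard cap on input tokens per request.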


Definitely a constructive approach!
The issue with retrieval is that it keeps filling up the context window, and if you reach the 128K-token limit of GPT-4 Turbo, that gets expensive: at $0.01 per 1K input tokens, a maxed-out context is about $1.28 of input tokens per request.

Of course, you can disable retrieval and provide your own conversation history just as before, while still keeping the advantage of the code interpreter, if that's what you are after.
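For the retrieval-off, code-interpreter-on setup, the assistant configuration could look like the sketch below. The keyword shapes follow the beta Assistants API as it stood in late 2023 and may since have changed, so treat this as illustrative rather than authoritative:

```python
# Sketch: an assistant with only the code interpreter tool enabled and
# no retrieval tool, so conversation history stays under your control.
assistant_spec = {
    "model": "gpt-4-1106-preview",
    "instructions": "You are a data analysis helper.",
    "tools": [{"type": "code_interpreter"}],  # note: no retrieval entry
}

# With the openai Python SDK this dict would be passed as:
#   client.beta.assistants.create(**assistant_spec)
```

Since retrieval never appears in `tools`, the model cannot stuff file chunks into the context on its own; whatever you put in the thread is all it sees.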