Assistants vs Chat Completions API pricing for a normal (non-retrieval) conversation

Hey y’all,

I’m building with the Assistants API, not using the retrieval function. I simply attach an object to the prompt holding data from MongoDB that needs to be analyzed (it’s not much, just info about a user), and I was wondering whether the normal API is cheaper or Assistants is. With the normal API you have to send the whole conversation back so ChatGPT “remembers” it, which would really start adding up. Assistants remembers context via a thread, but I haven’t seen much about how Assistants is charged for something like this: no retrieval, just a normal conversation.


It is Assistants that is much more expensive.

The Assistants backend sends the chat history to the model on every run, up to the context length limit, and you have no control over that limit or the length sent.

But with normal Chat Completions we also have little to no control over the context length, right? Assistants smartly “truncates” the context, so in a way wouldn’t that make it cheaper? Or is the token cost the same across both?

The “intelligence” comes at maximum expense: Assistants fills the context as fully as it can. That is laid out pretty clearly in the Assistants documentation and evidenced in use.

With Chat Completions and self-management, you or the user can make the exact decision you want between quality and budget.

I just want to use Assistants for a back-and-forth conversation, no function calling or retrieval. Would it be cheaper to just use the normal API, where I send the whole conversation back myself?

Yes, it would be cheaper on Chat Completions if you want it to be, because Chat Completions is the only one that DOESN’T have to send the whole context to the model on each user input: you can limit the number of past turns to exactly what you want. You can even put a button on each message in your user interface to delete older messages or disable them from being sent, or show a slider bar that greys them out from sending automatically once a token budget is reached.
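To make that concrete, here's a minimal sketch of the kind of self-managed truncation described above: trimming the message list to a token budget before each Chat Completions call. The function names, the budget value, and the 4-characters-per-token estimate are all my own illustrative assumptions (in a real app you'd count tokens with a proper tokenizer such as tiktoken), not anything from the official SDK:

```python
# Sketch: keep the system message plus only the most recent turns that
# fit within a rough token budget, so each API call stays cheap.
# ASSUMPTION: ~4 characters per token is a crude heuristic, not exact.

def estimate_tokens(message):
    """Very rough token estimate for one message dict."""
    return max(1, len(message["content"]) // 4)

def trim_history(messages, budget):
    """Return the system message (if first) plus the newest messages
    that fit within `budget` estimated tokens, in original order."""
    system = []
    rest = messages
    if messages and messages[0]["role"] == "system":
        system = [messages[0]]
        rest = messages[1:]
        budget -= estimate_tokens(messages[0])
    kept = []
    for msg in reversed(rest):       # walk newest -> oldest
        cost = estimate_tokens(msg)
        if cost > budget:
            break                    # stop once the budget is spent
        kept.append(msg)
        budget -= cost
    return system + kept[::-1]       # restore chronological order
```

You'd then pass `trim_history(full_history, some_budget)` as the `messages` argument to your Chat Completions call instead of the full history, which is exactly the lever Assistants doesn't give you.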