Do threads get more expensive over time?

DennyB · March 23, 2024, 10:50pm

If I keep the same thread id, the context grows over time, right? So does each subsequent prompt to that thread use an ever-increasing number of tokens? My prompts seem oddly expensive: I have a code-interpreter assistant that I prompt with ~1300 tokens, and it then generates a file of around 500 tokens. The total should be around 1-2 cents, but I see my usage jump by 30-40 cents. I am deliberately reusing the same thread so that I only pay the $0.03 code interpreter session fee once (per hour), but maybe I should reset so I’m not dragging along all these historical tokens…?

For context, I don’t really need the history. I’m asking for a similar task each time, just parameterized differently; I’m not really trying to build out a conversation.

thanks!
-denny-

Diet · March 23, 2024, 10:54pm

Yes!

Not only are you paying for the entire history each time, you’re also paying for any context that gets inserted by retrieval and other tools.

Have you considered using chat completions instead?
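To make the growth concrete, here is a rough back-of-the-envelope sketch using the numbers from the question (~1300 prompt tokens, ~500 generated tokens per run). It assumes the whole thread history is resent as input on every run and ignores instructions, tool output, retrieval context, and any truncation, so real usage will differ. The function name is made up for illustration.

```python
def thread_prompt_tokens(prompt_tokens: int, reply_tokens: int, runs: int) -> list[int]:
    """Input tokens billed on each run if the full thread history is resent every time."""
    billed = []
    history = 0  # tokens already sitting in the thread before this run
    for _ in range(runs):
        billed.append(history + prompt_tokens)
        history += prompt_tokens + reply_tokens  # this exchange joins the history
    return billed


per_run = thread_prompt_tokens(1300, 500, runs=10)
same_thread_total = sum(per_run)   # input tokens over 10 runs on one thread
fresh_thread_total = 1300 * 10     # input tokens if every run starts a new thread
print(per_run[0], per_run[-1], same_thread_total, fresh_thread_total)
```

Under these assumptions the input per run grows linearly (1300 on the first run, 17500 on the tenth), so the cumulative total grows roughly quadratically: 94000 input tokens on one thread versus 13000 across ten fresh threads.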

SomebodySysop · March 24, 2024, 12:47am

DennyB · March 25, 2024, 8:45pm

I need to download JSON data, and it looks like only assistants can create files, right? What I want is for a one-off chat to create a file.

_j · March 25, 2024, 10:45pm

Consider: anything that assistants does, it does by running threads against the same AI models that are available on chat completions, so you can do it too. It just takes programming on your part.

For example, “code interpreter”, the Python notebook with persistent storage, is accessed entirely through tool calls, with whatever values the AI’s code prints at the end returned to the AI. The AI writes Python code, and that code interacts with or produces files in the mount point of the Python environment (plus an “annotation” feature the AI can use in its responses).
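As a minimal sketch of that idea, the function below is what your own tool handler might call when the model returns Python code in a tool call: it runs the code in a working directory and reports the printed output plus any files created. The names `run_generated_code` and `workdir` are hypothetical, and this is not how the hosted code interpreter is implemented. It also has no sandboxing, so model-written code would need real isolation before being run like this.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, workdir: Path, timeout: int = 30) -> dict:
    """Run model-written Python inside workdir; report output and newly created files."""
    workdir.mkdir(parents=True, exist_ok=True)
    before = {p.name for p in workdir.iterdir()}
    script = workdir / "_generated.py"
    script.write_text(code, encoding="utf-8")
    proc = subprocess.run(
        [sys.executable, script.name],
        cwd=workdir,            # acts as the "mount point" for files the code writes
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    after = {p.name for p in workdir.iterdir()}
    new_files = sorted(after - before - {script.name})
    return {
        "returncode": proc.returncode,
        "stdout": proc.stdout,  # this is what you would send back as the tool result
        "stderr": proc.stderr,
        "files": new_files,     # candidates to offer to the user for download
    }


# Stand-in for code the model might write.
example_code = '''
import json
with open("out.json", "w") as f:
    json.dump({"ok": True}, f)
print("wrote out.json")
'''
workdir = Path(tempfile.mkdtemp())
result = run_generated_code(example_code, workdir)
print(result["files"], result["stdout"].strip())
```

Reusing the same `workdir` across calls is what gives you the persistent-storage behavior: files from one execution are still there for the next.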

If the “file” is simple text, you can have the AI write text directly that fits your purpose. However, it is Python code that can produce “an image with a bar graph” or a “re-sorted CSV with calculations”, results that are harder to hand back to the AI as a tool return and then on to the user. You would thus write similar mechanisms: the AI’s Python execution returns a success statement and writes the files it creates in a linkable manner, so that a file can be presented for download in the user interface from the stateful data store where your Python execution session took place.
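For the simple-text case, a one-off chat completions call that asks for JSON only needs a few lines on your side to turn the reply into a file. The sketch below leaves out the API call itself; `reply_text` stands in for the message content you would get back, and `save_json_reply` is a hypothetical helper name.

```python
import json
import tempfile
from pathlib import Path


def save_json_reply(reply_text: str, path: Path) -> Path:
    """Check that the model's reply parses as JSON, then write it to disk."""
    data = json.loads(reply_text)  # raises json.JSONDecodeError if the reply is not valid JSON
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path


# Stand-in for the text content of a chat completions reply.
reply_text = '{"region": "emea", "rows": [1, 2, 3]}'
out_path = save_json_reply(reply_text, Path(tempfile.mkdtemp()) / "report.json")
print(out_path.name)
```

Parsing before writing means a malformed reply fails loudly instead of leaving a broken file behind.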

This is certainly a large amount of coding and environment setup that “assistants” has already done for you. But while assistants is general-purpose, you can go beyond its capabilities with your own specialization and coding imagination.
