Surface cost for a thread run in the Assistants API

dhruv.anand · November 27, 2023, 7:29pm

Since the pricing model is quite complex, and it’s not clear what was sent to the model for a particular thread run (system message + message history + retrieved content chunks), it’d be useful to surface how much the thread run cost (in tokens and $).

Thanks

Foxalabs · November 27, 2023, 8:07pm

Hi and welcome to the Developer Forum!

This seems to be a feature that is being built, or at the very least discussed.

moonlockwood · November 27, 2023, 8:11pm

Details of usage would be great, would help with troubleshooting a lot. Thanks!

_j · November 27, 2023, 8:25pm

“usage tracking based on API key”?

How about the input and output tokens consumed by every single step, placed right in the run object or step objects as metadata, with retrieval of the full AI input and generation of every step available?

Then a ledger download in usage with every single API call made by an assistant, with all associated IDs.

Or would the truth be dangerous to the entire project?

Foxalabs · November 27, 2023, 8:28pm

You wanted an API level token tracking system, and that is what is being discussed. I feel that you are moving the goalposts. I do not see any intentional hiding of information, however you are free to have that opinion.

_j · November 28, 2023, 1:10am

I wanted nothing of the sort.

“Would you be excited / would it be useful if we released usage tracking based on API key for the API?”

That is not useful, except for those who mistakenly think that OpenAI should be tracking their ten customers’ usage for them, or think OpenAI should be running ChatGPT for them in an iframe while they get a cut of the action.

OpenAI has obviously, and for its own reasons, built “usage” so that you can only see use per model and per entire day, and released it the same hour of DevDay as “assistants”. Giving me one “usage” per API key adds nothing, except that I could make one call per day per key.

What IS useful is immediately seeing what is actually being consumed by an assistant run in real time by every step.

You make your own code replacement for every single feature of “assistant” (including its feature of not streaming), and you get the input and output token statistics of every single API call, and can log and diagnose the AI inputs and responses every iteration loop. You would see when one chat is totaling over $20.
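A minimal sketch of that kind of per-call logging, assuming you run the loop yourself and read the `prompt_tokens` and `completion_tokens` counts that Chat Completions responses report in their `usage` field. The `UsageLedger` class, the step labels, and the per-1K-token prices are hypothetical placeholders, not real rates:

```python
from dataclasses import dataclass, field


@dataclass
class UsageLedger:
    """Accumulates token counts and estimated cost across API calls."""

    # Hypothetical prices in USD per 1,000 tokens; replace with real rates.
    input_price_per_1k: float
    output_price_per_1k: float
    calls: list = field(default_factory=list)

    def record(self, label: str, prompt_tokens: int, completion_tokens: int) -> float:
        """Log one call and return its estimated cost in USD."""
        cost = (
            prompt_tokens / 1000 * self.input_price_per_1k
            + completion_tokens / 1000 * self.output_price_per_1k
        )
        self.calls.append(
            {
                "label": label,
                "prompt_tokens": prompt_tokens,
                "completion_tokens": completion_tokens,
                "cost_usd": cost,
            }
        )
        return cost

    def totals(self) -> dict:
        """Sum tokens and cost over every call recorded so far."""
        return {
            "prompt_tokens": sum(c["prompt_tokens"] for c in self.calls),
            "completion_tokens": sum(c["completion_tokens"] for c in self.calls),
            "cost_usd": sum(c["cost_usd"] for c in self.calls),
        }


if __name__ == "__main__":
    ledger = UsageLedger(input_price_per_1k=0.01, output_price_per_1k=0.03)
    # In a real loop these numbers would come from each response's usage field.
    ledger.record("step 1", prompt_tokens=1000, completion_tokens=500)
    ledger.record("step 2", prompt_tokens=2000, completion_tokens=0)
    print(ledger.totals())
```

Calling `record` after every iteration and checking `totals()` against a budget is one way to notice a conversation getting expensive while it is still running.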

OpenAI isn’t even forthcoming in describing how retrieval works, with their own tokens of “myfiles_browser” functions injected and iterating on scrolling through documents at your expense.

dhruv.anand · November 28, 2023, 8:22am

The question is specifically about Assistants API cost for a particular run.
This is separate from per-API-key token consumption tracking.

Foxalabs · November 28, 2023, 11:04am

For me, I make sure I have not used the model I want to test that day, then I run my test and stop. Then I wait for the API usage page to catch up (for me it’s usually about 10-15 minutes), and I can see the token count used for that session.

paul.edwards · December 1, 2023, 9:23am

For me, understanding the cost incurred by a conversation would be incredibly useful. I would think this means that the cost breakdown should live at the run object level.

Of course, if it was only available for each iteration of running a thread (i.e. becomes complex when the thread ‘requires action’), we could always have the cost for that individual thread run and copy that up into the metadata of the run object.

But all of this breeds complexity; it would be best if the answer was just in the run object.

However, I think run objects are volatile once ended, so you would need to monitor them and persist the results somewhere else.
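A rough sketch of persisting per-run usage outside the API, assuming you already have token counts for each run iteration from your own accounting (the thread notes the run object does not expose them). The table name, the helper functions, and the IDs are all hypothetical:

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small local store for run usage records."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_usage ("
        "run_id TEXT, thread_id TEXT, "
        "prompt_tokens INTEGER, completion_tokens INTEGER)"
    )
    return conn


def save_iteration(conn, run_id, thread_id, prompt_tokens, completion_tokens):
    """Persist the usage of one run iteration so it outlives the run object."""
    conn.execute(
        "INSERT INTO run_usage VALUES (?, ?, ?, ?)",
        (run_id, thread_id, prompt_tokens, completion_tokens),
    )
    conn.commit()


def thread_totals(conn, thread_id):
    """Sum the stored usage over every run in a thread (a conversation)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(prompt_tokens), 0), "
        "COALESCE(SUM(completion_tokens), 0) "
        "FROM run_usage WHERE thread_id = ?",
        (thread_id,),
    ).fetchone()
    return {"prompt_tokens": row[0], "completion_tokens": row[1]}


if __name__ == "__main__":
    conn = open_store()  # in-memory here; pass a file path to keep the data
    save_iteration(conn, "run_a", "thread_1", 1200, 300)
    save_iteration(conn, "run_b", "thread_1", 800, 100)
    print(thread_totals(conn, "thread_1"))
```

Keying each row by both run and thread ID keeps the per-run detail while still allowing a per-conversation total.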

vladvedinas · December 1, 2023, 9:36am

In its current state, the Assistants API is usable for a quick prototype test. Integrating it into an application is not feasible because you don’t have control over the costs, history, data retrieval, and tools used.
For more complex use cases it is better to replicate the functionality within your application.
