I'm not sure about the exact use of either. But the Tokenizer is just a local program you can call without the API. And as I understand it, Langfuse is more like an API layer between you and OpenAI.
Using Langfuse seems like a lot of work just to retrieve a token count. The problem is that OpenAI Assistants are still in beta, so not finished, and there's no word on whether they're planning to return the token consumption…
Short answer is that you can’t really predict the token usage at the moment, particularly when using functions or the available tools (retrieval, code interpreter).
This has been widely requested, but as far as we know there has been no improvement to the Assistants API since its initial beta launch in November.
Also, the tool itself has been widely replicated. I'd encourage you to build something that goes beyond a UI for the Assistants API.
There is no way to get it through API calls.
The way I've been doing it is just counting tokens and simulating the completion-call lifecycle, and it's close enough.
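If you want to try the same approach, here is a minimal sketch of the token-counting side using tiktoken. The per-message overhead constant is my own rough assumption about chat-format framing, and the hidden instructions / tool context the Assistants API adds are not included, so treat the result as an estimate only.

```python
import tiktoken

def estimate_prompt_tokens(messages, model="gpt-4-1106-preview"):
    """Rough estimate of prompt tokens for a list of {"role", "content"} dicts.

    The +4 tokens per message is an assumed chat-format overhead; system
    instructions and retrieval/tool context injected by the Assistants API
    are NOT counted here, so real usage will be higher.
    """
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")

    total = 0
    for message in messages:
        total += 4  # assumed per-message formatting overhead
        total += len(enc.encode(message["role"]))
        total += len(enc.encode(message["content"]))
    return total

print(estimate_prompt_tokens([{"role": "user", "content": "Hello, assistant!"}]))
```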
Assuming you are following the steps outlined here on how to use the Assistants API, the following method should work for you!
Using the Assistants endpoint, you can access the token count via the run object.
After you've created a thread, added a message to the thread, and run the assistant, you must wait for the run to finish. The run data includes the usage (prompt tokens, completion tokens, and total tokens) in the output.
This data is available via the List runs, List run steps, Retrieve run, Retrieve run step, and Modify run endpoints.
You can create a function that adds up each individual API call's token count by run, or even sum the totals of each run to get a total (conversation) count.
You can see what data is available from the run object here.
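To make those steps concrete, here is a minimal sketch using the openai Python SDK's beta namespace. The assistant ID and message content are placeholders, and the polling loop is just one way to wait for the run; treat this as an illustration of reading run.usage rather than production code.

```python
import time
from openai import OpenAI

client = OpenAI()
ASSISTANT_ID = "asst_..."  # placeholder: your assistant's ID

# 1. Create a thread and add a user message.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Summarise the attached document.",  # placeholder content
)

# 2. Run the assistant and wait for the run to finish.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=ASSISTANT_ID)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# 3. Usage is populated once the run completes.
print(run.usage.prompt_tokens, run.usage.completion_tokens, run.usage.total_tokens)

# 4. Sum usage across all runs on the thread for a conversation total.
runs = client.beta.threads.runs.list(thread_id=thread.id)
conversation_total = sum(r.usage.total_tokens for r in runs.data if r.usage)
print("conversation total tokens:", conversation_total)
```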
Hi @jorgeintegrait, sorry to bother you, but I wanted to know if there has been any improvement yet. I've been working with an assistant using retrieval as a tool, and when I use run.usage, the number of tokens I get is not the same as what the usage page reports. Even for the number of requests: if I use the assistant once, the number of requests shown on the usage page is about 16. Thank you in advance.
The important number is the token count, not so much the API calls, as OpenAI doesn’t charge per call.
Take a look at the usage screen and do a very slow test (their usage screen sometimes takes a while to update) and compare those results with the usage results from the API.
Hello @jorgeintegrait, I followed the steps you mentioned, but the numbers obtained from the run.usage.prompt_tokens and run.usage.completion_tokens metrics are 48566 and 1116, respectively. However, the usage dashboard reports different figures for context tokens (57477) and generated tokens (1189). I don't quite understand where the problem is, or perhaps I'm using the functionality incorrectly.
This lines up with my testing; the token reporting of the API seems inconsistent at the moment.
What I imagine is that it is counting some output tokens as context tokens again, and that makes the difference between what the API sees and what the back end reports. It is also possible that they even use different calculations.
At the moment, it seems more of a guideline than a definite cost estimate. Nonetheless, you can at least use this information in your project and estimate that the context cost could be up to 20% higher than the token count returned from the API.
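If you want to budget on the safe side, you could apply that margin as a simple multiplier on what run.usage reports. The 1.2 factor below is just that rough 20% assumption from the observations above, not an official figure.

```python
SAFETY_MARGIN = 1.2  # assumed ~20% headroom over run.usage, per the discrepancy seen above

def estimated_billed_prompt_tokens(run):
    """Pad the API-reported prompt tokens to approximate the dashboard figure."""
    return int(run.usage.prompt_tokens * SAFETY_MARGIN)
```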
If you provide details of the timestamps (when you made your requests), it is possible that someone from OpenAI can add that information to the issue report and help fix it, as I imagine this is an issue they already know about.
In any case, best of luck! And sorry that there isn't a perfect answer atm.