How can I control the expenditure of a budget?

Hi,

I'm developing an app/connector (OpenAI_Assistant ↔ MyApp ↔ Business_chats) that connects clients' chats with OpenAI assistants.

I want to provide it to clients. But I had a question:

Let’s say I have 3 clients. I created an API key and an assistant for each, and each client accesses his assistant through his own API key.

It turns out they all draw from the one shared balance on my account, which I keep topping up. How can I control/limit the API keys?

For example, my clients allocate different amounts for the work of their OpenAI assistants: $10, $70, and $100.

I have a total of $500 on my balance.

How can I control consumption?

What possible steps do you currently see for this situation?

Regards, Viktor

Hi Viktor!

OpenAI doesn’t offer a solution to this. You need to create a gateway that tracks usage and billing (“API Monetization”).

I expect you’d want to either offer a fixed-price product with a cost (“fair usage”) ceiling, or include an uplift to cover your ancillary costs.

In either case, you need to manually track your users’ actual or statistical spending.
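A minimal sketch of what such a gateway ledger could look like, assuming hypothetical per-client budget figures (taken from the question: $10, $70, $100) and a per-call cost you compute yourself from the API's reported usage — all class and client names here are illustrative, not part of any OpenAI API:

```python
# Hypothetical per-client budget ledger for an API-monetization gateway.
# Budgets and the per-call cost are illustrative; in practice you'd derive
# each call's cost from the token usage the API reports back.

class BudgetExceeded(Exception):
    pass

class BudgetLedger:
    def __init__(self, budgets):
        # budgets: dict of client_id -> allocated dollars
        self.budgets = dict(budgets)
        self.spent = {client: 0.0 for client in budgets}

    def charge(self, client, cost):
        """Record a call's cost, refusing it if it would exceed the budget."""
        if self.spent[client] + cost > self.budgets[client]:
            raise BudgetExceeded(f"{client} would exceed ${self.budgets[client]:.2f}")
        self.spent[client] += cost
        return self.budgets[client] - self.spent[client]  # remaining budget

ledger = BudgetLedger({"client_a": 10.0, "client_b": 70.0, "client_c": 100.0})
remaining = ledger.charge("client_a", 0.25)  # one call costing $0.25
print(remaining)  # 9.75
```

The gateway would call `charge` before (or right after) each API call and return an error to the client once their allocation is used up.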


Hi Diet, thank you for your message!

Currently, I see the simplest course of action as registering an OpenAI account for each client, which they can top up themselves. My application would then be offered on a subscription basis. However, from a service perspective, this is not convenient.

Otherwise, as far as I understand, I would need to calculate how many tokens they use, how much it costs, etc. But I don’t see the point in doing this: OpenAI already does it, and what I calculate may not match what OpenAI calculates.

In fact, OpenAI already does all these calculations, and they could very easily implement what I need. Do I just have to wait?

It’s possible.

They’re rolling out limited tracking at https://platform.openai.com/usage (right-hand side), but for now it only shows how many calls were made and to which models.

I wouldn’t wait on features that they haven’t announced (but tbh I wouldn’t even wait on things they do announce).

Thank you, I’ve seen it.

Another option is token accounting using standard methods. You can use: https://platform.openai.com/docs/guides/text-generation/chat-completions-api
It returns usage data:

"usage": {
    "completion_tokens": 17,
    "prompt_tokens": 57,
    "total_tokens": 74
}
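Given that usage block, the dollar cost of a call can be computed locally. A sketch, with assumed per-1K-token prices (the real prices vary by model, so check the current pricing page):

```python
# Compute one call's cost from the API's reported token usage.
# The prices below are assumptions for illustration only.
PRICE_PER_1K_PROMPT = 0.0005      # assumed $ per 1K prompt tokens
PRICE_PER_1K_COMPLETION = 0.0015  # assumed $ per 1K completion tokens

def call_cost(usage):
    """Dollar cost of a single call, given its 'usage' dict."""
    return (usage["prompt_tokens"] / 1000 * PRICE_PER_1K_PROMPT
            + usage["completion_tokens"] / 1000 * PRICE_PER_1K_COMPLETION)

# The usage object quoted above:
usage = {"completion_tokens": 17, "prompt_tokens": 57, "total_tokens": 74}
print(f"{call_cost(usage):.8f}")  # cost in dollars for this one call
```

Summing `call_cost` per client over time gives the spend figure a gateway would check against each budget.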

But it doesn’t fit my case. As far as I understand, chat completions is a one-time question-and-answer, not a dialogue with history.

ooh.

Well, the way it works is that when you have a conversation, you append the new messages to the old message list and send the whole shebang to the model again. So you get quadratic growth in input token cost the longer the conversation gets.

If you use streaming, the API won’t compute that for you, but you can use tiktoken to get a really good estimate.
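That quadratic growth is easy to see with a toy calculation. Assuming each conversation turn adds a fixed 50 tokens (a made-up figure; tiktoken would give real counts), the prompt tokens billed per turn and in total look like this:

```python
# Toy illustration of quadratic input-token growth in a chat:
# every turn resends the whole history, so turn n's prompt is ~n * turn_tokens.
turn_tokens = 50  # assumed tokens added per conversation turn
turns = 10

prompt_tokens_per_turn = [n * turn_tokens for n in range(1, turns + 1)]
total_prompt_tokens = sum(prompt_tokens_per_turn)

print(prompt_tokens_per_turn[-1])  # 500 tokens sent on the 10th turn
print(total_prompt_tokens)         # 2750 tokens billed in total
```

The per-turn cost grows linearly, so the cumulative cost over the conversation grows roughly with the square of the number of turns.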

edit: I’m wondering, did you choose assistants because you didn’t know you could do it with chat completions? The cost would be similar or less by using the chat completion api directly.

In my application, the work with the API is implemented like this:

#python
import asyncio
from openai import AsyncOpenAI

clientAI = AsyncOpenAI(api_key='secret_key')

thread = await clientAI.beta.threads.create()
# add the user's message to the thread
message_ai = await clientAI.beta.threads.messages.create(thread_id=thread.id,
                                                         role="user",
                                                         content=message)
# start a run with the assistant
run = await clientAI.beta.threads.runs.create(thread_id=thread.id,
                                              assistant_id='asst_id')
# poll until the run is no longer queued or in progress
run_status = await clientAI.beta.threads.runs.retrieve(thread_id=thread.id,
                                                       run_id=run.id)
while run_status.status in ('queued', 'in_progress'):
    await asyncio.sleep(1)
    run_status = await clientAI.beta.threads.runs.retrieve(thread_id=thread.id,
                                                           run_id=run.id)
messages = await clientAI.beta.threads.messages.list(thread_id=thread.id)

There is also additional logic for processing assistant functions, when:

run_status.status == 'requires_action'
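That `requires_action` logic is essentially a dispatch loop over the tool calls the run requests. A sketch with the run's `required_action` stubbed as plain dicts, shaped like the API's response (in the real flow these objects come from the Assistants API, and the outputs are sent back via `submit_tool_outputs`; the `get_weather` function is a made-up example):

```python
import json

# Local functions the assistant is allowed to call (illustrative).
def get_weather(city):
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def build_tool_outputs(required_action):
    """Turn a run's required_action into the tool_outputs payload."""
    outputs = []
    for call in required_action["submit_tool_outputs"]["tool_calls"]:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        outputs.append({"tool_call_id": call["id"], "output": fn(**args)})
    return outputs

# Stubbed required_action, shaped like what the API returns:
required_action = {"submit_tool_outputs": {"tool_calls": [
    {"id": "call_1",
     "function": {"name": "get_weather", "arguments": '{"city": "Kyiv"}'}}]}}

print(build_tool_outputs(required_action))
# [{'tool_call_id': 'call_1', 'output': 'Sunny in Kyiv'}]
```

Note that the tokens consumed by these tool-call round trips also count toward the run's cost, which is part of why threads are hard to budget for.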

Do I have quadratic growth in input token cost here?

As I understand it, in my case, I need to count it myself, for example, through tiktoken

But I don’t want to do this, because there might be discrepancies in calculations with OpenAI

Why calculate myself what OpenAI is already calculating?

Yeah, but it’s extra complicated with threads. I’m not sure whether it still does auto-truncation, and it depends on whether you use runs, but in general there’s a lot of debate around cost control with the assistants.

OpenAI will probably eventually release tools that help you track this (unless they decide to retire assistants completely)

In my biased opinion, the decision to use assistants in a product comes with a ton of risks and not many rewards. You can achieve the same thing with other tools, but I understand that it may be easier for novice developers to get started by just using assistants.


I am that noob, such that I know not of the risks of which you speak. Could you please explain?

From my perspective having assistants that tackle different problems or approaches to problem solving and threads for different users interacting seems pretty useful.

Well, in the short term you have uncontrollable costs, at least that’s my understanding.

In the long term, you have vendor lock-in.


Thank you for opening my eyes to the real pricing of the API. Before this, I thought I was only paying for the message and response. I read the topic “Assistants API pricing details per message” and I’m outraged.

I will probably postpone my idea until better times.

I have similar tasks. I came across the service Team-GPT (team-gpt.com); it seems to be able to track keys and limit the number of requests.

You can also look at WorkBot, another AI platform.