Quota management for openai models

joyasree78 · October 10, 2023, 2:59pm

Is there a way to programmatically check the cost of using the models and then, if the cost goes over a threshold, suspend the API key?

Thanks

grandell1234 · October 10, 2023, 3:12pm

Yes, you can set a limit on how many tokens are used. You can find more information here.

Foxalabs · October 10, 2023, 3:13pm

Hi,

Sure. You can count the tokens sent and received with tiktoken and report them from your API relay server. Your end-user application should authenticate with the API relay so that each client is uniquely identified.

OAuth or similar is suitable for this. That way you can build a usage database that records each transaction, the model used, and the tokens sent and received, and then run a billing cycle from that data on your preferred schedule.
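As a rough illustration of such a usage database, here is a minimal sketch using Python's built-in sqlite3. The table layout and the function names (`open_ledger`, `record_usage`, `usage_by_client`) are made up for this example. The token counts are assumed to come from wherever you measure them on the relay, for instance a tiktoken count or the `usage` field of the API response.

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Open the usage ledger and create the table on first use (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS usage (
               client_id  TEXT    NOT NULL,
               model      TEXT    NOT NULL,
               tokens_in  INTEGER NOT NULL,
               tokens_out INTEGER NOT NULL,
               created_at TEXT    DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def record_usage(conn, client_id, model, tokens_in, tokens_out):
    """Store one relayed request for an authenticated client."""
    conn.execute(
        "INSERT INTO usage (client_id, model, tokens_in, tokens_out) "
        "VALUES (?, ?, ?, ?)",
        (client_id, model, tokens_in, tokens_out),
    )
    conn.commit()


def usage_by_client(conn, client_id):
    """Return {model: (total_tokens_in, total_tokens_out)} for one client."""
    rows = conn.execute(
        "SELECT model, SUM(tokens_in), SUM(tokens_out) FROM usage "
        "WHERE client_id = ? GROUP BY model ORDER BY model",
        (client_id,),
    ).fetchall()
    return {model: (t_in, t_out) for model, t_in, t_out in rows}
```

A billing run would then just iterate over clients and price the totals that `usage_by_client` returns.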

joyasree78 · October 10, 2023, 6:12pm

Sorry, I probably did not ask the question correctly. Currently I can go to my account page and see how much I have spent so far. I am looking for a programmatic way to get that information instead of logging in to my account to check. I am trying to create a resource monitor that runs every 5 minutes (for example) to see whether any threshold has been breached.

Foxalabs · October 10, 2023, 6:15pm

There is currently no official billing API; you would need to use the website dashboard for that.

https://help.openai.com/en/articles/6614209-how-do-i-check-my-token-usage
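Without an official endpoint, a monitor like the one described above has to work from spend figures you compute yourself. Here is a minimal sketch of the polling loop. `get_spend_usd` and `on_breach` are hypothetical callables you would supply: the first returns the spend calculated from your own logs, the second does whatever "suspend" means in your setup (for example, making your relay stop forwarding requests). Nothing here calls an OpenAI API.

```python
import time


def check_spend(get_spend_usd, threshold_usd, on_breach):
    """Run one check; call on_breach(spend) if the threshold is reached."""
    spend = get_spend_usd()
    breached = spend >= threshold_usd
    if breached:
        on_breach(spend)
    return breached


def run_monitor(get_spend_usd, threshold_usd, on_breach,
                interval_s=300, max_checks=None):
    """Poll every interval_s seconds (default 5 minutes).

    Returns True as soon as a breach is seen, or False if max_checks
    checks pass without one. max_checks=None polls until a breach.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if check_spend(get_spend_usd, threshold_usd, on_breach):
            return True
        checks += 1
        time.sleep(interval_s)
    return False
```

In practice you might run `check_spend` from cron or a scheduler instead of keeping a loop alive.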

_j · October 10, 2023, 6:29pm

If you are logging the inputs and outputs, you can also count the tokens your usage consumes and calculate the cost yourself.
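A minimal sketch of that calculation, using two of the price classes from the tables further down in this post. It assumes those prices are USD per 1,000 tokens and that each log record already carries its token counts. The names `estimate_cost` and `total_cost` and the record layout are invented for the example.

```python
# Prices copied from the price-class table in this post.
# Assumption: the figures are USD per 1,000 tokens.
PRICES_PER_1K = {
    "gpt4": {"price_in": 0.03, "price_out": 0.06},
    "turbo": {"price_in": 0.0015, "price_out": 0.002},
}


def estimate_cost(price_class, tokens_in, tokens_out, prices=PRICES_PER_1K):
    """Estimated USD cost of one call from its input and output token counts."""
    p = prices[price_class]
    return (tokens_in * p["price_in"] + tokens_out * p["price_out"]) / 1000


def total_cost(records, prices=PRICES_PER_1K):
    """Sum the estimate over logged calls.

    Each record is a (price_class, tokens_in, tokens_out) tuple,
    a hypothetical log format chosen for this sketch.
    """
    return sum(estimate_cost(pc, t_in, t_out, prices) for pc, t_in, t_out in records)
```

The result is only an estimate; the dashboard remains the authoritative figure.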

I have a library I’ve been plugging away at with the goal of returning any model metadata one could desire. Besides servicing queries like the token cost of your database logging or the context length of selected models, one could use it as an AI function so an AI could answer, or as a website backend. As part of scope creep, it is now part of a library, openaiutils (so that it can never get done). Some of the tables (just snippets):

def init_price_classes(self):
    self.price_classes = {
        # language gpt-3.5, 4 (limits are base)
        "gpt4": {"price_in": 0.03, "price_out": 0.06, "price_train": -1, "limit_ktpm":10, "limit_rpm":"200"},
        "gpt4-32": {"price_in": 0.06, "price_out": 0.12, "price_train": -1, "limit_ktpm":10, "limit_rpm":"200"},
        "turbo": {"price_in": 0.0015, "price_out": 0.002, "price_train": 0.0080, "limit_ktpm":90, "limit_rpm":"3500"},
        "instruct": {"price_in": 0.0015, "price_out": 0.002, "price_train": 0.0080, "limit_ktpm":250, "limit_rpm":"3000"},
        "ft-turbo": {"price_in": 0.012, "price_out": 0.016, "price_train": 0.0080, "limit_ktpm":90, "limit_rpm":"3500"},

…

def init_model_list(self):
    self.model_list = {
        'gpt-3.5-turbo-16k':        {'price_class': 'turbo-16',   'endpoint': 'chatf', 'tokenizer': 'cl100k_base', 'context': 16385, 'cutoff': '2021-09', 'retire_date': ''},
        'gpt-3.5-turbo-16k-0613':   {'price_class': 'turbo-16',   'endpoint': 'chatf', 'tokenizer': 'cl100k_base', 'context': 16385, 'cutoff': '2021-09', 'retire_date': ''},
        'gpt-3.5-turbo-instruct':   {'price_class': 'instruct',      'endpoint': 'complete', 'tokenizer': 'cl100k_base', 'context': 4097, 'cutoff': '2021-09', 'retire_date': ''},
        'gpt-3.5-turbo-instruct-0914': {'price_class': 'instruct',   'endpoint': 'complete', 'tokenizer': 'cl100k_base', 'context': 4097, 'cutoff': '2021-09', 'retire_date': ''},
        'gpt-4':                    {'price_class': 'gpt4',       'endpoint': 'chatf', 'tokenizer': 'cl100k_base',  'context': 8192, 'cutoff': '2021-09', 'retire_date': ''},
        'gpt-4-0314':               {'price_class': 'gpt4',       'endpoint': 'chat',  'tokenizer': 'cl100k_base',  'context': 8192, 'cutoff': '2021-09', 'retire_date': '2024-06-13'},
        'gpt-4-0613':               {'price_class': 'gpt4',       'endpoint': 'chatf', 'tokenizer': 'cl100k_base',  'context': 8192, 'cutoff': '2021-09', 'retire_date': ''},

…


    def init_urls(self):
        self.endpoint_base = "https://api.openai.com"
        self.endpoint_url = {
            "chat": "/v1/chat/completions",
            "chatf": "/v1/chat/completions",  # model supports functions
            "tune": "/v1/fine_tuning/jobs",
            "oldtune": "/v1/fine-tunes",
            "embed": "/v1/embeddings",
            "complete": "/v1/completions",
            "mod": "/v1/moderations",
...

    self.embed_alias = [
        {'ada-search-document': 'text-search-ada-doc-001', 'reported': 'text-search-ada:001', 'dim': -0.03518218919634819},
        {'curie-search-document': 'text-search-curie-doc-001', 'reported': 'text-search-curie:001', 'dim': -0.02822701632976532},
        {'curie-search-query': 'text-search-curie-query-001', 'reported': 'text-search-curie:001', 'dim': -0.027747787535190582},
        {'ada-similarity': 'text-similarity-ada-001', 'reported': 'text-similarity-ada:001', 'dim': -0.01897580549120903},
        {'curie-similarity': 'text-similarity-curie-001', 'reported': 'text-similarity-curie:001', 'dim': -0.010900501161813736},
        {'davinci-search-document': 'text-search-davinci-doc-001', 'reported': 'text-search-davinci:001', 'dim': -0.00792695488780737},

…

def init_deprecation_replacements(self):
    self.upgrade_auto = {
        'ada': 'babbage-002',
        'babbage': 'babbage-002',
        'curie': 'davinci-002',
        'davinci': 'davinci-002',
        'gpt-3.5-turbo-0301' : 'gpt-3.5-turbo',

…

    self.stable_alias = [
        {'gpt-3.5-turbo': 'gpt-3.5-turbo-0613', 'reported': 'gpt-3.5-turbo-0613'},
        {'gpt-3.5-turbo-16k': 'gpt-3.5-turbo-16k-0613', 'reported': 'gpt-3.5-turbo-16k-0613'},
        {'gpt-4': 'gpt-4-0613', 'reported': 'gpt-4-0613'},
        {'gpt-4-32k': 'gpt-4-32k-0613', 'reported': 'gpt-4-32k-0613'},
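The lookup code itself isn't shown above, so the following is only a guess at how tables shaped like `upgrade_auto` and `stable_alias` might be used together: first swap a deprecated name for its replacement, then map a floating alias to the dated snapshot. `resolve_model` is a hypothetical helper, and the tables here hold only a few of the entries from the snippets.

```python
# Subsets of the tables above, copied as-is.
UPGRADE_AUTO = {
    "gpt-3.5-turbo-0301": "gpt-3.5-turbo",
}
STABLE_ALIAS = [
    {"gpt-3.5-turbo": "gpt-3.5-turbo-0613", "reported": "gpt-3.5-turbo-0613"},
    {"gpt-4": "gpt-4-0613", "reported": "gpt-4-0613"},
]


def resolve_model(name, upgrade_auto=UPGRADE_AUTO, stable_alias=STABLE_ALIAS):
    """Map a requested model name to the snapshot expected to serve it."""
    # Step 1: replace a deprecated model with its successor, if listed.
    name = upgrade_auto.get(name, name)
    # Step 2: each alias entry uses the alias itself as a key, next to the
    # bookkeeping key "reported", so skip that key when matching.
    for entry in stable_alias:
        if name != "reported" and name in entry:
            return entry[name]
    # Already a dated snapshot, or unknown: return unchanged.
    return name
```

The resolved name can then be used as the key into a table like `model_list` to find the price class and context length.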

joyasree78 · October 11, 2023, 12:28am

This is great. Do you intend to make this library open source?

Thanks
