Proposal: Introducing an API Endpoint for Token Count and Cost Estimation

As more developers build on OpenAI’s models, the need for transparent, accessible budgeting tools has become apparent. A programmatic way to estimate token usage and the associated cost before sending a request would be a significant step forward, particularly for higher-priced models like GPT-4.

This proposal outlines the introduction of a dedicated API endpoint for token count and cost estimation and suggests enhancing model objects with pricing information to aid developers in making more informed decisions.

The Proposal

  1. API Endpoint for Token Count and Cost Estimation (count_tokens): This endpoint would give developers an efficient way to estimate how many tokens a text input consumes, along with the expected cost, for a specific model such as GPT-4.
  2. Incorporate Pricing Information into Model Endpoints: To further aid decision-making processes, it is proposed that model detail endpoints be updated to include essential pricing information.

Benefits

  • Accurate Cost Management: Enables developers to accurately manage and forecast their expenditures on the OpenAI platform.
  • Seamless Developer Workflow: Integrates directly into development workflows, allowing for real-time cost estimations without manual intervention.
  • Transparent Pricing: Offers clear visibility into pricing structures, promoting trust and reliability in the OpenAI ecosystem.

Suggested Implementation

Here is what a request to the count_tokens endpoint could look like, using GPT-4 as an example:

POST /v1/count_tokens
Content-Type: application/json

{
    "model": "gpt-4",
    "text": "Your sample text goes here."
}

Expected Response:

{
    "tokens": 150,
    "cost": 0.012
}
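
To make the request and response shapes above concrete, here is a minimal client-side sketch in Python. It only builds and parses the JSON bodies; nothing is sent over the network, because the count_tokens endpoint is a proposal and does not exist. The names TokenEstimate, build_count_tokens_request, and parse_count_tokens_response are hypothetical, chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenEstimate:
    """Parsed form of the proposed count_tokens response (hypothetical)."""
    tokens: int
    cost: float


def build_count_tokens_request(model: str, text: str) -> str:
    """Serialize the JSON body for the proposed POST /v1/count_tokens."""
    return json.dumps({"model": model, "text": text})


def parse_count_tokens_response(body: str) -> TokenEstimate:
    """Read the "tokens" and "cost" fields from the proposed response body."""
    data = json.loads(body)
    return TokenEstimate(tokens=int(data["tokens"]), cost=float(data["cost"]))
```

A caller would post the string from build_count_tokens_request with any HTTP client and pass the response body to parse_count_tokens_response.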

To include model-specific pricing details transparently:

GET /v1/models

Sample Response:

{
    "models": [
        {
            "id": "gpt-4",
            "object": "model",
            "token_cost_per_thousand": "0.08"
        },
        // More model objects...
    ]
}
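
If the models response carried pricing as sketched above, a client could also compute the estimate locally. The following Python sketch assumes the proposed "models" list and "token_cost_per_thousand" field, neither of which is part of the current API; price_per_thousand and estimate_cost are hypothetical helper names. Decimal is used so the string price is not distorted by floating-point rounding.

```python
import json
from decimal import Decimal


def price_per_thousand(models_body: str, model_id: str) -> Decimal:
    """Look up the proposed per-thousand-token price for one model."""
    data = json.loads(models_body)
    for model in data["models"]:
        if model["id"] == model_id:
            # The proposal shows the price as a string, so parse it exactly.
            return Decimal(model["token_cost_per_thousand"])
    raise KeyError(f"no pricing found for model {model_id!r}")


def estimate_cost(tokens: int, per_thousand: Decimal) -> Decimal:
    """Cost = price per thousand tokens * tokens / 1000."""
    return per_thousand * tokens / Decimal(1000)
```

With the sample price of 0.08 per thousand tokens, a 150-token input works out to 0.08 * 150 / 1000 = 0.012.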

The introduction of the count_tokens endpoint, particularly with support for GPT-4, represents a critical enhancement to the OpenAI API, streamlining development processes and facilitating better budget management. Community feedback on this proposal is invaluable for refining and implementing these suggestions effectively.

They’ve already got a token counter. Set max_tokens to a deliberately oversized value such as 33000 and the request is rejected immediately, with an error message that reports the measured size of your input, including tools, functions, and overhead. If you want a bare text measurement, you can send the text to a completions model and parse that error instead.
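
A rough sketch of parsing such an error in Python follows. The wording in SAMPLE_ERROR is an assumption about what the error text looks like, not a documented format, and it may differ between models or change without notice; input_tokens_from_error is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed shape of the error text; the numbers are made up for illustration.
SAMPLE_ERROR = (
    "This model's maximum context length is 8192 tokens. However, you "
    "requested 33012 tokens (12 in the messages, 33000 in the completion). "
    "Please reduce the length of the messages or completion."
)

# Matches "(12 in the messages" or "(12 in your prompt" style fragments.
_INPUT_TOKENS = re.compile(r"\((\d+) in (?:the|your) (?:messages|prompt)")


def input_tokens_from_error(message: str) -> Optional[int]:
    """Return the input token count reported in the error, or None."""
    match = _INPUT_TOKENS.search(message)
    return int(match.group(1)) if match else None
```

Because this depends on free-form error text, it should return None rather than guess when the pattern does not match.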

The models endpoint would be where cost can come from.

This sounds more like a workaround, whereas the proposal makes a valid business case: it addresses the concerns of developers who currently have to resort to workarounds like the one you outlined above.

They have shifted into “revenue” mode. Your proposal is great, but it means taking on accountability that they would rather not have. Evidence? Switching to a pre-buy scheme that makes charges even murkier. As for it costing them computing cycles… If they were so concerned about that, they would differentiate high-value vector responses from rare value responses.