Estimating GPT API Conversation Costs: Factoring in Cumulative Input and Output Tokens

Hello everyone,

Recently, I was tasked with estimating the average cost of a conversation using the GPT APIs. The challenge was to factor in both input and output tokens, given that each subsequent request includes the entire previous conversation as input.

The function I’ve devised does the following:

  1. For each new reply, it assumes that the chatbot takes into account all preceding messages (from both the user and the chatbot itself) as its input.
  2. It then calculates the cost for both input and output tokens for each interaction.
  3. Additionally, it checks if the token count for any interaction exceeds a specified limit, and raises a warning if it does.

Since I haven’t come across a similar solution elsewhere, I wanted to reach out to the community.

Can anyone confirm if this approach seems sound? If it proves to be useful, I hope it serves as a reference for others in the future. Your feedback is much appreciated!

import warnings

def chatbot_cost(num_replies, avg_user_reply, avg_bot_reply, input_cost, output_cost, token_limit):
    total_cost = 0
    total_tokens = 0

    # Lists to keep track of the lengths of all replies
    user_replies = [avg_user_reply] * num_replies
    bot_replies = [avg_bot_reply] * num_replies

    for i in range(num_replies):
        # Input for turn i: all user messages so far plus all previous bot replies
        input_tokens = sum(user_replies[:i + 1]) + sum(bot_replies[:i])
        output_tokens = avg_bot_reply

        # Warn (rather than abort) so the remaining turns are still costed
        if input_tokens + output_tokens > token_limit:
            warnings.warn("Token limit exceeded!")

        # Cost for this turn, at per-token prices for input and output
        interaction_cost = (input_tokens * input_cost) + (output_tokens * output_cost)
        total_cost += interaction_cost

        # Running total of tokens processed across the whole conversation
        total_tokens += input_tokens + output_tokens

    return total_cost
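Since every turn uses the same average lengths, the loop also has a closed form, which is handy as a sanity check. A minimal sketch (the names `conversation_cost_closed_form` and `conversation_cost_loop` are mine, not from the function above, but they implement the same cumulative-context model):

```python
def conversation_cost_closed_form(n, u, b, p_in, p_out):
    # Turn i (0-indexed) sends (i + 1) user replies and i bot replies as input,
    # so summing over all n turns gives triangular-number totals.
    input_tokens = u * n * (n + 1) // 2 + b * n * (n - 1) // 2
    output_tokens = b * n
    return input_tokens * p_in + output_tokens * p_out

def conversation_cost_loop(n, u, b, p_in, p_out):
    # Same model, written as an explicit per-turn loop for comparison.
    total = 0
    for i in range(n):
        input_tokens = (i + 1) * u + i * b
        total += input_tokens * p_in + b * p_out
    return total
```

Both versions should agree for any inputs; the closed form makes it easy to see that cost grows roughly quadratically in the number of turns.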

Needless to say, the function was written by GPT-4.

I have a class that extends the chat endpoint message itself with the token count, whether you pass a single role message or the entire list of them you send to the AI API. You then get the actual count in the message dictionary under the key "tokens", alongside "role", "name", and "content". The count reflects the tokens that message adds when it is included in what you send.

This allows storage in chat history with a per-message token count, so one can calculate exactly what can be passed of chat history into the remaining context.

I also use a send method that strips the metadata back out (though you could just use return[0]['tokens'] as your count). It wouldn't take much more to extend the class with a method that totals the counts across all messages in the list.

Thank you for your response!

Your approach is valuable for calculating the tokens based on actual usage, and I’ll definitely consider it for tracking tokens in real-time conversations.

However, I might not have been explicit enough in my initial post. My primary aim is to predict the cost for hypothetical conversations that haven’t occurred yet, rather than analyzing already-existing data.