Overheads 'tokens'? Usage does not add up

ziqizhang · September 20, 2023, 10:41am

So I am experimenting with a conversation chain using LangChain to interface with OpenAI and checking the costs. I don’t understand how the first API call calculates a ‘prompt token’ number that is significantly higher than what I expect. Here is the code:

import langchain
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.callbacks import get_openai_callback
import tiktoken

llm = ChatOpenAI(model_name='gpt-3.5-turbo',
                      openai_api_key="MY KEY")
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory = memory,
    verbose=True
)
with get_openai_callback() as cb:
    input="Hi, my name is Andrew"
    tokenizer=tiktoken.get_encoding("cl100k_base")
    toks=len(tokenizer.encode(text=input)) # line A
    print(toks)

    costs={"tokens used":0, "prompt tokens":0, "completion tokens":0, "successful requests":0, "cost (usd)":0}
    result = conversation.predict(input=input)
    costs['tokens used']=cb.total_tokens
    costs['prompt tokens'] = cb.prompt_tokens
    costs['completion tokens'] = cb.completion_tokens
    costs['successful requests'] = cb.successful_requests
    costs['cost (used)'] = cb.total_cost
    print(result)
    print(costs) #line B

On line A, I get 6, which is consistent with the OpenAI tokenizer (OpenAI Platform).
But then on line B, the output says:

{'tokens used': 87, 'prompt tokens': 70, 'completion tokens': 17, 'successful requests': 1, 'cost (usd)': 0, 'cost (used)': 0.00013900000000000002}

Which suggests the prompt is 70 tokens but in fact I am expecting 6. Am I wrong? What am I missing here?

Thanks!

Foxalabs · September 20, 2023, 10:44am

Hi and welcome to the Developer Forum!

LangChain wraps your input in its own prompt text, which adds tokens to each prompt.

ziqizhang · September 20, 2023, 10:54am

Thanks for your quick reply! That is very useful to know…

I understand now that this is a LangChain issue, but I wonder if you could share a more specific reference (a link or an article) on this? Is this ‘overhead’ fixed, or does it depend on my input? I just want to know what to expect… I did a search on their GitHub issues page but found nothing related to this.

Thank you again.

Foxalabs · September 20, 2023, 11:26am

So LangChain is a framework for controlling LLMs. It contains procedures, string handling and manipulation routines, along with predefined and tested prompts built by the LangChain developers.

These take some of the effort out of building LLM-based systems, and in return you pay an overhead in the form of hidden prompts.

If you need full token-level control of your prompting, then LangChain is not the tool to use.
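
To make the overhead concrete, here is a rough sketch of where the extra prompt tokens probably come from. It assumes the default ConversationChain prompt looks like the text in ASSUMED_TEMPLATE (check conversation.prompt.template in your installed version; the wording may differ) and that ConversationBufferMemory replays earlier turns as "Human: ..." / "AI: ..." lines. The helpers build_prompt, count_tokens and simulate_turns are made up for illustration and are not part of LangChain. If tiktoken cannot be loaded, the count falls back to a crude word count, so treat the numbers as relative rather than exact.

```python
# Assumed copy of ConversationChain's default prompt; verify it against
# `conversation.prompt.template` in your installed LangChain version.
ASSUMED_TEMPLATE = (
    "The following is a friendly conversation between a human and an AI. "
    "The AI is talkative and provides lots of specific details from its context. "
    "If the AI does not know the answer to a question, it truthfully says it "
    "does not know.\n\n"
    "Current conversation:\n{history}\nHuman: {input}\nAI:"
)

# Use tiktoken when it is installed and its encoding can be loaded;
# otherwise fall back to a rough whitespace word count.
try:
    import tiktoken
    _ENCODER = tiktoken.get_encoding("cl100k_base")
except Exception:
    _ENCODER = None


def count_tokens(text):
    """Return a token count (tiktoken) or a rough word count (fallback)."""
    if _ENCODER is not None:
        return len(_ENCODER.encode(text))
    return len(text.split())


def build_prompt(user_input, history=""):
    """Fill the assumed template the way the chain would before calling the API."""
    return ASSUMED_TEMPLATE.format(history=history, input=user_input)


def simulate_turns(turns):
    """Return the prompt size for each turn as the buffered history grows.

    `turns` is a list of (human_text, ai_text) pairs.
    """
    history_lines = []
    sizes = []
    for human_text, ai_text in turns:
        prompt = build_prompt(human_text, "\n".join(history_lines))
        sizes.append(count_tokens(prompt))
        # Buffer memory keeps every earlier exchange and replays it next turn.
        history_lines += [f"Human: {human_text}", f"AI: {ai_text}"]
    return sizes


user_input = "Hi, my name is Andrew"
print("input only:", count_tokens(user_input))
print("full prompt:", count_tokens(build_prompt(user_input)))
print("per turn:", simulate_turns([
    (user_input, "Hello Andrew!"),
    ("What is my name?", "Your name is Andrew."),
]))
```

Under these assumptions, the overhead has a fixed part (the template text) and a part that grows with every turn (the replayed history), so it is not a constant. The chat API also adds a few formatting tokens per message on top of the text itself, so the reported prompt_tokens will likely be slightly higher than a plain tiktoken count of the prompt string.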
