How to calculate tokens from data bound to a vector database

I need to know how to calculate the tokens used and how to estimate the cost when using a vector database with OpenAIEmbeddings. I also need to know how the vectors are counted as tokens.

You can use tiktoken to tokenise and count the input text: GitHub - openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Then multiply the cost per 1,000 tokens by the number of tokens in your corpus divided by 1,000. For example, 500,000 tokens at $0.0001 per 1,000 tokens costs 500,000 / 1,000 × $0.0001 = $0.05.

A rough estimate of the token count is 1/4 of the number of bytes in the dataset.
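If you just want a ballpark figure before running a tokenizer, here is a minimal TypeScript sketch for Node.js based on that bytes / 4 rule of thumb. The function name is illustrative, and the $0.0001 figure is the ada embedding price quoted later in this thread, so check the current pricing page; exact counting with tiktoken is shown further down.

```typescript
// Rough cost estimate for embedding a corpus:
//   tokens ≈ bytes / 4
//   cost   = (tokens / 1000) × price per 1,000 tokens
const ADA_EMBEDDING_PRICE_PER_1K_TOKENS = 0.0001; // USD, illustrative; verify current pricing

function estimateEmbeddingCost(text: string): { tokens: number; cost: number } {
  const bytes = Buffer.byteLength(text, "utf8"); // size of the corpus in bytes
  const tokens = Math.ceil(bytes / 4);           // rough rule of thumb: ~4 bytes per token
  const cost = (tokens / 1000) * ADA_EMBEDDING_PRICE_PER_1K_TOKENS;
  return { tokens, cost };
}

// Example: a 2 MB product catalogue ≈ 500,000 tokens ≈ $0.05 to embed once.
const { tokens, cost } = estimateEmbeddingCost("your corpus text here");
console.log(`~${tokens} tokens, ~$${cost.toFixed(4)}`);
```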


Thank you for your reply. I'm using Node.js; could you please tell me the best way to implement it in Node.js?

Sure, you can check out GitHub - ceifa/tiktoken-node: OpenAI's tiktoken but with node bindings.
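A minimal sketch of counting tokens with that package. It assumes tiktoken-node exposes an `encodingForModel()`/`encode()` API as in its README, so verify the exact imports against the package's current documentation before relying on it.

```typescript
// Assumed tiktoken-node API (encodingForModel / encode); check the README for the exact shape.
import tiktoken from "tiktoken-node";

// gpt-3.5-turbo and text-embedding-ada-002 both use the cl100k_base encoding,
// so this count also applies to your embedding inputs.
const enc = tiktoken.encodingForModel("gpt-3.5-turbo");

function countTokens(text: string): number {
  return enc.encode(text).length; // encode() returns an array of token ids
}

const productDescription = "Red ceramic mug, 350 ml, dishwasher safe.";
console.log(countTokens(productDescription)); // exact token count for this text
```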


Thank you,
I have a question, please: if I have a list of 100 products with details for each one, and I embed them and then send the data to OpenAI, will the token count be the same as the token count of the products before embedding?

You can embed your company documentation one time, then run queries against that embedding database. Each time you run a new query against the embedding database you must first generate a vector to search against; the cost of this is very small, $0.0001 per 1,000 tokens, as it uses the ada model. Once you have created the vector and used it to retrieve your context, you can then call the standard gpt-3.5 or gpt-4 model with the embedding data as context and run your original query with that.
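A minimal sketch of that query flow using the official `openai` Node package (v4-style API). The `searchVectorDb` helper is hypothetical, a stand-in for whatever vector database client you actually use (Pinecone, pgvector, etc.):

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical stand-in for your vector database lookup; replace with your store's client.
async function searchVectorDb(queryVector: number[]): Promise<string> {
  return "top matching product details retrieved from your vector store";
}

async function answer(question: string): Promise<void> {
  // 1. Embed the query (billed at the ada embedding rate, $0.0001 per 1,000 tokens).
  const embedding = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: question,
  });
  const queryVector = embedding.data[0].embedding;
  console.log(`embedding tokens billed: ${embedding.usage.total_tokens}`);

  // 2. Retrieve the most relevant documents from the vector database.
  const context = await searchVectorDb(queryVector);

  // 3. Call the chat model with the retrieved documents as context
  //    (these context tokens are billed at the chat model's rate).
  const chat = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: question },
    ],
  });
  console.log(chat.choices[0].message.content);
  console.log(`chat tokens billed: ${chat.usage?.total_tokens}`);
}

answer("Which mugs are dishwasher safe?");
```

Note that the embedding vectors themselves are not billed again when you query them; you pay for the tokens of the query you embed and for the tokens of the context plus question you send to the chat model.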
