How to calculate cost and tokens when using a Faiss vector database with OpenAI through LangChain

I have built a chatbot using OpenAI, LangChain, and a Faiss vector database, and I need to know how to calculate the tokens used across the whole process [OpenAIEmbeddings, prompt template, user message, assistant message]. Please advise me in detail so we can take this chatbot to the next stage.


LangChain is doing the requests for you, so you might want to check LangChain's docs for that.

The OpenAI API answers with prompt, completion, and total token counts on each request, like this:


"usage": { "prompt_tokens": 5, "completion_tokens": 5, "total_tokens": 10 }


You then just need to store it, e.g. in a CSV file or a database.
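As a minimal sketch, here is one way to append the usage object from each response to a CSV file. The file name and the shape of the dict being passed in are assumptions for illustration:

```python
import csv
import datetime
from pathlib import Path

# Sketch: append the "usage" object from each API response to a CSV file.
# "token_usage.csv" is just an example name.
LOG_FILE = Path("token_usage.csv")

def log_usage(usage: dict) -> None:
    """Append one usage record (prompt/completion/total tokens) to the log."""
    new_file = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["timestamp", "prompt_tokens",
                            "completion_tokens", "total_tokens"])
        writer.writerow([
            datetime.datetime.now().isoformat(),
            usage["prompt_tokens"],
            usage["completion_tokens"],
            usage["total_tokens"],
        ])

# Example with the usage object shown above:
log_usage({"prompt_tokens": 5, "completion_tokens": 5, "total_tokens": 10})
```

A database table with the same four columns works just as well if you need to aggregate costs per user or per day later.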

I also suggest calculating the tokens for the request and storing them before you send it, because even when your request times out it might still be charged.

Setting up separate API keys might also help with tracking, depending on your use case.

What you have to pay for tokens is listed here:

The pricing of Faiss depends on where you host it or what the distributor charges. Basically, Faiss is an open-source vector DB that you can host on your own server.
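To illustrate turning a usage object into a dollar amount: the per-1K-token prices below are made-up placeholders, not real OpenAI prices; always check the current pricing page.

```python
# Sketch: compute the cost of one request from its "usage" object.
# These prices are placeholder values, NOT current OpenAI prices.
PROMPT_PRICE_PER_1K = 0.0015      # assumed USD per 1K prompt tokens
COMPLETION_PRICE_PER_1K = 0.002   # assumed USD per 1K completion tokens

def request_cost(usage: dict) -> float:
    return (usage["prompt_tokens"] / 1000 * PROMPT_PRICE_PER_1K
            + usage["completion_tokens"] / 1000 * COMPLETION_PRICE_PER_1K)

cost = request_cost({"prompt_tokens": 5, "completion_tokens": 5,
                     "total_tokens": 10})
```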


Thank you for your reply.
Could you please tell me how OpenAI handles the vector database and tokens? After converting data such as a product list into vector data, how does OpenAI calculate the tokens for the vectors: is all the product data converted to tokens, or does each word become a separate vector and a separate token?
I hope you understand my point.

Basically, there is an API request. That is what is sent to the OpenAI API.

You can save the request data and compare it with the usage object. I think that's the safest way to find out, unless the request fails somehow.


OK, I will do that.
I just need to know how data is sent to OpenAI from the vector DB.
Do vectors count the same as words for token usage?

The vector DB basically just helps you save on requests. LangChain will ask it for a response. You can also feed it with specific information, and when you make a request it will first query the vector DB; if it can't find a good answer there, it will call the OpenAI API.

It's like a cache that can semantically find results based on the request, which reduces API calls (and also speeds up answers, since the vector DB "knows things" based on similar vectors, and OpenAI API calls are really slow, while you have control over the speed of the vector DB).

Save the usage from the response as I explained in my previous post and you are good to go.
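The caching idea can be sketched with a toy semantic cache. The "embedding" here is just a normalized letter-frequency vector standing in for a real embedding model, and the 0.8 similarity threshold is an arbitrary choice; the whole thing is purely illustrative:

```python
import math
from collections import Counter

# toy_embed is a stand-in for a real embedding model (e.g.
# text-embedding-ada-002): it maps text to a normalized letter-frequency
# vector, so similar wording gives nearby vectors.
def toy_embed(text: str) -> dict:
    counts = Counter(c for c in text.lower() if c.isalpha())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {c: v / norm for c, v in counts.items()} if norm else {}

def cosine(a: dict, b: dict) -> float:
    return sum(v * b.get(c, 0.0) for c, v in a.items())

class SemanticCache:
    def __init__(self, threshold: float = 0.8):  # arbitrary cutoff
        self.entries = []  # list of (embedding, answer) pairs
        self.threshold = threshold

    def store(self, question: str, answer: str) -> None:
        self.entries.append((toy_embed(question), answer))

    def lookup(self, question: str):
        """Return a cached answer if a stored question is similar enough,
        otherwise None (meaning: fall through to the OpenAI API)."""
        q = toy_embed(question)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None

cache = SemanticCache()
cache.store("what is the price of product A", "Product A costs $10.")
hit = cache.lookup("price of product A?")       # similar wording -> cache hit
miss = cache.lookup("shipping time to Berlin")  # unrelated -> None, call API
```

A real setup would use Faiss for the nearest-neighbor search and an embedding model for the vectors, but the control flow (check cache first, call the API only on a miss) is the same.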

Here is also a video that might answer some basic questions and shows how to use a vector DB in front of OpenAI requests.


Thank you.
One question, please: if I have a list of 100 products with details for each one, embed them, and then send the data to OpenAI, will the token count be the same as the token count of the products before embedding?

Vectorizing is something like giving a phrase, a word, or even a whole text a number, where numbers are closer together when the texts have higher similarity.
You are not sending this number to OpenAI; that is something that works internally. How would OpenAI know which "number" your vector DB assigned to your text?

And with embeddings it works quite similarly.

You embed the data, e.g. with an ada model, and then you can ask that model for product information.

But you still communicate in human language, not in vector representations.

Tokens are calculated both on the human-language words you send and on the answer, which is also human language.

You won’t talk to anyone in bits and bytes.


Just stumbled upon something that might be useful as well:
