Hello everyone!
I’m developing code to compare different retriever techniques using LangChain and OpenAI embeddings (currently text-embedding-ada-002), to find the best retriever for my specific use case in terms of time, cost, and efficiency.
I’m comparing the following retriever modalities:
- Base Retriever (with different chunk sizes and overlaps)
- Contextual Compressor Retriever
- Ensemble Retriever
- Multi Query Retriever
- Parent Document Retriever
- Time Weighted Retriever
To compute the one-off cost of building the vector store, I count tokens with tiktoken:

```python
import tiktoken

model_cost = 0.10 / 1_000_000  # $0.10 per 1M tokens for text-embedding-ada-002

splits = text_splitter.split_documents(docs)
encoding = tiktoken.encoding_for_model("text-embedding-ada-002")

total_tokens = 0
for chunk in splits:
    total_tokens += len(encoding.encode(chunk.page_content))

total_cost = total_tokens * model_cost
```
However, I am not sure how to compute the cost of each request to the retriever. For instance, the cost of this call:

```python
snippets = retriever.invoke(query)
```
I tried using `get_openai_callback`, but it didn’t work.
Can anyone help me, please?