Navigating OpenAI Embeddings API Pricing: Token Count vs. API Calls

kouravleen1234 · July 6, 2023, 11:03am

On what basis are we charged for using the OpenAI embeddings API (openai.Embedding.create)? The pricing page shows a per-token charge; for example, Ada costs $0.0001 per 1K tokens. Is the charge unrelated to the number of API calls?
If not, can you please explain the pricing for the embeddings API?

EricGT · July 6, 2023, 11:21am

Welcome to the forum!

As a moderator I simplified your title.

How, you may ask?

I don’t know the conditions for who has this enabled in the Discourse editor but in the menu bar is this icon

I then used the Suggest topic titles option and picked one.

For all those with Trust Level 3 and higher, use the gas pedal.

Foxalabs · July 6, 2023, 11:25am

1 token is approximately 0.75 words, so 1K tokens is about 750 words. You pay $0.0001 per 1K tokens.
From that, you get about 4 characters per token, or roughly 4 KB of text per 1K tokens, for $0.0001.
On that basis, you can approximate the cost of your embedding as:
Cost in $ = Size of data in kilobytes * 0.000025

API calls in this context are not a factor
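As a rough sketch of the arithmetic above in Python: the price constant is the Ada figure quoted in this thread, and the 4-characters-per-token ratio is only a rule of thumb for English text (assuming about 1 byte per character). The function names are made up for illustration and are not part of any OpenAI library.

```python
# Rough embedding cost estimate, based on the rules of thumb in this thread.
# Assumptions: $0.0001 per 1K tokens (Ada price quoted above) and
# about 4 characters per token for English text.
PRICE_PER_1K_TOKENS = 0.0001  # USD
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count from character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def cost_from_tokens(tokens: int) -> float:
    """Cost in USD for a given number of tokens."""
    return tokens / 1000 * PRICE_PER_1K_TOKENS


def cost_from_kilobytes(kilobytes: float) -> float:
    """Cost in USD using the shortcut: 4 KB ~ 1K tokens ~ $0.0001."""
    return kilobytes * (PRICE_PER_1K_TOKENS / CHARS_PER_TOKEN)


# The number of API calls does not enter the formula: 3000 tokens cost
# the same whether they are sent in one request or in three.
one_call = cost_from_tokens(3000)
three_calls = sum(cost_from_tokens(1000) for _ in range(3))
```

For an exact token count you would need a real tokenizer; this only gives a ballpark figure.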

hrkpatel · July 6, 2023, 12:15pm

Hello,
Token counting is fairly complex, but you can visit the OpenAI Platform, try some text there, and check how many tokens it uses. That will give you a rough idea of the pricing.

kouravleen1234 · July 6, 2023, 12:53pm

Thank you so much for your reply. I got it.

danimaraas · March 12, 2024, 8:19am

I want to be sure that I have understood your answer correctly. We have a chatbot project where we have embedded some data. The embeddings are stored in a vector database in Azure. What I can’t fully understand from the documentation alone, is whether future prompts also will have an embedding cost.

In other words, will future prompts be priced based on token usage with regards to the language model (prompt tokens + completion tokens) in addition to the embedding cost of the prompt tokens?

Foxalabs · March 12, 2024, 9:40am

Hi,

Yes, each “query” you generate will also have to be vectorized. The newly generated vector is then compared via a similarity function to all of the other vectors stored in your database. The “hard work” is mostly done up front when you vectorize your dataset. After that, you run a single embed per new query and apply the similarity function, which is very cheap to compute, across your data.

To cliff notes that:

Yes, you need to embed (vectorize) all of your data, and you then also need to vectorize each new question you ask of the dataset.
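A minimal sketch of the query side, assuming cosine similarity as the similarity function (your vector database may use a different one). The vectors here are tiny hand-written stand-ins rather than real embeddings, and the names are hypothetical. In a real setup, `query_vector` would come from one embeddings API call per question, which is the only per-query embedding cost.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vec, stored):
    """Return the key of the stored vector closest to the query."""
    return max(stored, key=lambda name: cosine_similarity(query_vec, stored[name]))


# Embedded once, up front (toy 2-dimensional stand-ins for real embeddings).
stored_vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [0.7, 0.7],
}

# Embedded per question: one embedding call, billed on the question's tokens.
query_vector = [0.9, 0.1]

best_match = most_similar(query_vector, stored_vectors)
print(best_match)  # doc_a
```

The comparison itself is plain arithmetic on stored numbers, so it adds no embedding tokens.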
