Cost of search endpoint (Number of tokens in your query)

I read the following Q&A page.

However, I couldn't understand how tokens are counted for the search endpoint.

Total tokens = Number of tokens in all of your documents
+ (Number of documents + 1) * 14
+ (Number of documents + 1) * Number of tokens in your query
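As a sanity check, the quoted formula can be turned into a small helper. This is only a sketch of the formula as quoted above (the per-document overhead of 14 tokens comes from that Q&A page, not from any official library):

```python
def search_total_tokens(doc_tokens, query_tokens):
    """Estimate billed tokens for one call to the search endpoint,
    per the formula quoted above.

    doc_tokens   -- list with the token count of each document
    query_tokens -- token count of the single search query
    """
    n_docs = len(doc_tokens)
    return (sum(doc_tokens)
            + (n_docs + 1) * 14
            + (n_docs + 1) * query_tokens)

# One 1,000,000-token document (10,000 statements x 100 tokens each)
# searched with a 10-token query:
print(search_total_tokens([10000 * 100], 10))  # -> 1000048
```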

I would like to input conversation data into the OpenAI API.
In the conversation data, each statement is separate.

In this case, does the above "Number of tokens in your query" variable mean the total number of tokens across all statements in our dataset?

Or does it mean the number of tokens in each individual statement in our dataset?

For example, suppose there are 10,000 statements, the average number of tokens per statement is 100, and there is only one document.

If the first is true, I think:
Total tokens = (10000 × 100) + (2 × 14) + (2 × 10000 × 100)

If the second is true, I think the usage cost would be very high:
Total tokens = 100 × ((10000 × 100) + (2 × 14) + (2 × 100))

Which is true?
Or are both false?

Weird, haven’t seen much on their search service. Why wouldn’t you just create embedding vectors from the documents and search that way? If you use their embedding engine (latest is text-embedding-ada-002) you would only pay once for what you embed. But you would pay for your own compute and database to retrieve the closest search results.
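The embed-and-search approach suggested above can be sketched without any API calls. The vectors below are tiny made-up examples; in practice each would be an embedding returned by text-embedding-ada-002, computed once per document and stored in your own database (so the toy vectors and dimensionality here are assumptions, not real model output):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Toy document vectors (in practice: embeddings from text-embedding-ada-002,
# paid for once and stored locally).
doc_vectors = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.8, 0.1],
    "doc_c": [0.0, 0.2, 0.9],
}

query_vector = [0.85, 0.15, 0.05]  # embedding of the user's query

# Rank documents by similarity; the closest is the search result.
ranked = sorted(doc_vectors,
                key=lambda d: cosine(query_vector, doc_vectors[d]),
                reverse=True)
print(ranked[0])  # -> doc_a
```

This is why the embedding route is cheap at query time: only the query itself needs a new embedding call, and the similarity search runs on your own compute.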


Unfortunately the search endpoint is deprecated, so the pricing doesn't apply.


Sorry, I should explain the purpose for which I would use the search endpoint.

I would like to classify the statements, so I'll use the Classifications endpoint.
The classification pricing Q&A at this link says:

Internally this endpoint makes calls to the search and completions endpoints

So I recognized that I must pay for both the search and completions endpoints when using classifications.

Oh, thank you. I found "deprecated" at the following link.

But why do they mention the search endpoint in the classification pricing Q&A at the following link?

I don't know much about the embedding engine, so I'll study it. Thank you.


In the given formula for counting tokens, the variable “Number of tokens in your query” refers to the number of tokens in the search query that the user inputs. It does not refer to the number of tokens in the documents or statements in the dataset.

Therefore, in your example, if a user enters a query with, let’s say, 10 tokens, the total tokens would be calculated as follows:

Total tokens = (10000 * 100) + ((1+1) * 14) + ((1+1) * 10)
= 1,000,048 tokens

So, the answer is neither 1 nor 2. The correct answer depends on the number of tokens in the user’s search query.

The embedding is priced differently from the completion. Take the number of tokens to embed, divide by 1000, and multiply by the embedding rate for ada-002.

That is the fixed, one-off cost of embedding ("training") your documents.

To make a query:

Take the tokens for the user's query, divide by 1000, and multiply by the embedding rate for ada-002 to get the cost of deriving a vector you can use for searching (A).

Then add together the tokens for the snippets you find using semantic search (to use as a context) and the tokens for the actual question (including the extra text where you might say “referring to the following context, answer the question below.”)

Divide that total by 1000 and multiply by the davinci003 rate to get the cost of asking the question. (B)

When you get the answer or completion back, divide the tokens used by the completion part by 1000 and multiply by the Davinci rate to get the cost of your answer. (C)

Add A, B, and C together. This is the cost of the query (give or take a couple of tokens).

The cost will depend on the question asked and the length of the contexts you use in the final query
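The steps above can be put together in a short sketch. The per-1K-token rates and the token counts below are illustrative assumptions, not authoritative figures; check the current pricing page for real rates:

```python
# Illustrative per-1K-token rates (assumed for this example, not authoritative):
ADA_002_RATE = 0.0004   # $/1K tokens, text-embedding-ada-002
DAVINCI_RATE = 0.02     # $/1K tokens, davinci completion model

def cost(tokens, rate_per_1k):
    """Dollar cost for a token count at a per-1000-token rate."""
    return tokens / 1000 * rate_per_1k

query_tokens      = 20    # user's question, embedded for the vector search (A)
context_tokens    = 1500  # retrieved snippets + question + instruction text (B)
completion_tokens = 250   # tokens in the model's answer (C)

a = cost(query_tokens, ADA_002_RATE)       # embed the query
b = cost(context_tokens, DAVINCI_RATE)     # send the prompt
c = cost(completion_tokens, DAVINCI_RATE)  # receive the answer
total = a + b + c
print(f"${total:.6f}")  # -> $0.035008
```

As the numbers show, the completion (B and C) dominates the per-query cost; the embedding of the query (A) is nearly free.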


Thank you, AlexDeM and raymonddavey.

I now understand the token counting and cost calculation for each endpoint.
Thanks to you, I could check whether my cost calculations were correct.