Is there a way to set a Token Limit for the OpenAI Embedding API Endpoint?

The examples in the embeddings guide use the tiktoken library to count tokens before a request.

But since I’m building a serverless extension, bundling heavyweight libraries into it doesn’t seem practical (I could be wrong; I’m not familiar with Node.js libraries).

Is it possible to limit the tokens in the HTTP request itself?

If it’s not possible, considering that I’m still new to programming, how can I include a tokenizer in my extension?

This is the function I use to request the embeddings:

async function OPENAI_Embedding(texts) {
  const apiUrl = 'https://api.openai.com/v1/embeddings';
  const requestOptions = {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      'input': texts,                      // a string or an array of strings
      'model': 'text-embedding-ada-002'
    }),
  };

  try {
    const response = await fetch(apiUrl, requestOptions);
    const data = await response.json();

    // Surface API errors (bad key, oversized input, etc.) instead of
    // failing later on a missing usage field.
    if (!response.ok) {
      throw new Error(data.error?.message ?? `HTTP ${response.status}`);
    }

    // Track spend using the token count the API reports back.
    STORAGE_tokensUsed_add(data.usage.total_tokens);

    return data.data; // one { embedding: [...] } object per input
  } catch (error) {
    console.error('Error making request to OpenAI EmbedTexts:', error);
    return null;
  }
}
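
For reference, a call looks roughly like this (the function resolves to the API's data array, where each element carries an embedding field holding the vector for that input):

(async () => {
  const results = await OPENAI_Embedding(['first chunk of text', 'second chunk']);
  if (results) {
    // Each embedding is an array of floats (1536 values for text-embedding-ada-002).
    console.log('Vector length:', results[0].embedding.length);
  }
})();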

There is no request parameter for that. Neither the embeddings endpoint nor the language-model endpoints let you set a cap on the input: what you send is what you pay for.

The output of the embeddings endpoint is not tokens but a vector (an array of floating-point numbers), so there is no equivalent of a max_tokens setting on the output side either.

To count tokens, you can use the tiktoken library (or one of its JavaScript ports). What you do with that information about the true size is then up to you. text-embedding-ada-002 uses the same cl100k_base encoding, a roughly 100k-token vocabulary, as the current chat models.
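
For counting in Node.js without the Python library, here is a minimal sketch assuming the js-tiktoken npm package (a pure-JavaScript port, so there is no native or WASM dependency to bundle) and the 8191-token input limit documented for text-embedding-ada-002:

import { getEncoding } from 'js-tiktoken';

// text-embedding-ada-002 uses the cl100k_base encoding.
const enc = getEncoding('cl100k_base');

// Documented per-input token limit for text-embedding-ada-002.
const MAX_INPUT_TOKENS = 8191;

function countTokens(text) {
  return enc.encode(text).length;
}

// Guard inputs before calling OPENAI_Embedding().
function checkInputs(texts) {
  for (const text of texts) {
    const count = countTokens(text);
    if (count > MAX_INPUT_TOKENS) {
      throw new Error('Input is ' + count + ' tokens, over the ' + MAX_INPUT_TOKENS + '-token limit');
    }
  }
  return texts;
}

Instead of throwing, you could also trim an oversized input by slicing the encoded token array and (if the port exposes a decode method, which I believe js-tiktoken does) turning it back into text.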

“Serverless” also sits a bit oddly with the typical use of embeddings, which is similarity matching or retrieval against a large database of existing data to compare against.
