I’m having a very odd problem using the embeddings API via the Python client.
I have a few thousand documents to process, and I send them in batches of 30.
Sometimes it works fine: all 30 documents are processed and I get a nice array of 1536 values for each document. But sometimes the API returns a 12288-length array instead, which is not only useless to me (because I can’t combine it with the previous 1536-length arrays) but also causes a huge spike in my consumption.
On my usage report I’ve noticed that when it works, it logs requests to text-embedding-ada-002-v2, which is exactly what I expected. When it returns the bigger array (and bigger cost), it logs calls to text-similarity-davinci, and I have no idea why.
Given the inconsistent and unpredictable behavior I tend to assume it’s a bug (a very expensive one), but maybe you guys could give me a hint.
the code is pretty straightforward:
```python
import openai

def get_embedding(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    result = openai.Embedding.create(input=[text], model=model)
    embeddings = result['data'][0]['embedding']
    return embeddings

# makeText(row) is my helper that builds the document text from a dataframe row
textsDF['embedding'] = textsDF.progress_apply(lambda row: get_embedding(makeText(row)), axis=1)
```
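In the meantime I’m considering wrapping the call in a sanity check so a mis-routed request fails fast instead of silently polluting the dataframe. This is just a sketch, assuming ada-002 always returns a 1536-dimensional vector and that the response echoes the model name in a `model` field (which the usage report suggests):

```python
import openai

EXPECTED_DIM = 1536  # expected length of a text-embedding-ada-002 vector

def get_embedding_checked(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    result = openai.Embedding.create(input=[text], model=model)
    embedding = result['data'][0]['embedding']
    returned_model = result.get('model', '')
    # Fail loudly if the request was routed to a different model
    # or the vector has an unexpected length
    if len(embedding) != EXPECTED_DIM or not returned_model.startswith('text-embedding-ada-002'):
        raise ValueError(f"Unexpected response: model={returned_model}, dim={len(embedding)}")
    return embedding
```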