What are the valid embedding input values?

acdavis629 · March 7, 2023, 3:57am

I am attempting to create the embeddings for each sentence in a book. So after removing all sentences shorter than 3 characters, I have a list of strings of length 2305. When I pass this list to the embedding input, I get "(LIST) is not valid under any of the given schemas - 'input'". So it seems there are some values that are not being accepted. What are they? Are there certain characters which the embedding model does not accept?

wfhbrian · March 7, 2023, 12:06pm

It seems like you might be passing the entire array to the Embeddings API, instead of each individual sentence.

acdavis629 · March 7, 2023, 12:25pm

Yes, I am. It accepts either a string or an array. It works fine if I pass it a list of strings around 250 length. Is there a size limit?

wfhbrian · March 7, 2023, 12:49pm

I haven’t seen any limits specific to batch size, so maybe you’re hitting the 350,000 TPM rate limit?

If not, maybe there is a batch limit that just isn’t well documented.

wfhbrian · March 7, 2023, 12:51pm

If you’re still thinking this might be the problem, I would send one sentence per request so that you can see which is causing the error, and investigate it from there.
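
Something like this rough Python sketch would do the one-at-a-time check. Note that `embed_one` is a hypothetical callable standing in for whatever single-input embedding call you already have; it is not a function from any library, and the helper name is made up too.

```python
from typing import Callable, List, Sequence, Tuple


def find_failing_inputs(
    sentences: Sequence[str],
    embed_one: Callable[[str], object],
) -> List[Tuple[int, str, str]]:
    """Send each sentence on its own and collect the ones that raise.

    `embed_one` is an assumed wrapper around a single-input embedding
    request; any exception it raises is treated as a rejected input.
    """
    failures = []
    for index, sentence in enumerate(sentences):
        try:
            embed_one(sentence)
        except Exception as exc:
            # Keep the position, the text, and the error message
            # so the offending sentence can be inspected afterwards.
            failures.append((index, sentence, str(exc)))
    return failures
```

If the returned list comes back empty, no single sentence is being rejected on its own, which points toward a limit on the request as a whole rather than a bad character.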

acdavis629 · March 7, 2023, 1:04pm

Yeah good suggestions. I was hoping for some documentation to clear up the guesswork.

Edit: @wfhbrian I was able to embed it by breaking the calls up. The problem was I was exceeding the 8192 token limit. Duh…
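
For anyone landing here later, a minimal sketch of one way to break the calls up, assuming the 8192-token limit applies to the combined input of a single request. The function names are made up for illustration, and `rough_token_count` is only a crude stand-in (about 4 characters per token); for real counts you would pass in a proper tokenizer-based counter instead.

```python
from typing import Callable, Iterable, Iterator, List


def rough_token_count(text: str) -> int:
    # Crude approximation, NOT a real tokenizer: assume ~4 characters
    # per token and never return less than 1.
    return max(1, len(text) // 4)


def batch_by_token_budget(
    sentences: Iterable[str],
    max_tokens: int = 8192,  # limit mentioned in this thread
    count_tokens: Callable[[str], int] = rough_token_count,
) -> Iterator[List[str]]:
    """Yield lists of sentences whose summed token count fits the budget."""
    batch: List[str] = []
    used = 0
    for sentence in sentences:
        cost = count_tokens(sentence)
        if cost > max_tokens:
            # A single sentence over the budget can never fit in any batch.
            raise ValueError(f"sentence needs {cost} tokens, budget is {max_tokens}")
        if batch and used + cost > max_tokens:
            # Adding this sentence would overflow, so emit the current batch.
            yield batch
            batch, used = [], 0
        batch.append(sentence)
        used += cost
    if batch:
        yield batch
```

Each yielded list can then be sent as one array input, so the number of requests stays far lower than one call per sentence.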
