New Embedding model input size

Do the newer embedding models have a different maximum input token limit, or is it still 8191 tokens?

Same length. Disable a local token-count check and run an oversized request, and you get:

openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens, however you requested 8444 tokens (8444 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}
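
For reference, a minimal sketch of how that limit can be probed, assuming the openai Python SDK (v1+) and tiktoken; the model name and the 8192 figure are taken from the error above, and the test string is just a placeholder:

import tiktoken
from openai import OpenAI, BadRequestError

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")  # encoding used by the embedding-3 models

text = "token " * 9000  # deliberately more than 8192 tokens
print("local token count:", len(enc.encode(text)))

try:
    client.embeddings.create(model="text-embedding-3-large", input=text)
except BadRequestError as e:
    # expect the 400 "maximum context length is 8192 tokens" error shown above
    print(e)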

However, I noted that when I sent a list of inputs, only the one oversized item was reported in that error, not the total for the whole list.
Sending a list to text-embedding-3-large:

[Total tokens for 5 embeddings] Counted: 16011; API said: 16011
[Total tokens for 10 embeddings] Counted: 56011; API said: 56011

So a single API call can take an array of inputs whose total is more than the 8k limit.
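
A sketch of the kind of batch test behind those numbers (the placeholder texts and sizes here are mine, not the original inputs): each item stays under 8,192 tokens, the list total goes well past it, and the locally counted total is compared with what the API reports back:

import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")

# five strings of roughly 2k tokens each, so the request totals ~10k tokens
texts = ["some document text " * 700 for _ in range(5)]
local_total = sum(len(enc.encode(t)) for t in texts)

resp = client.embeddings.create(model="text-embedding-3-large", input=texts)
print(f"[Total tokens for {len(texts)} embeddings] "
      f"Counted: {local_total}; API said: {resp.usage.total_tokens}")
print("vectors returned:", len(resp.data))  # one embedding per input string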

Interesting … I wonder if they are truncating to 8k for you and not throwing the error? :thinking:

Truncation of inputs was never an option, intended or otherwise.

Previously, sending an array of strings (one of the input types the embeddings endpoint accepts, alongside a bare string) would give an error if the total tokens of the request exceeded 8k.

I suspect there was, and may still be, some clever engineering where a model's context length is broken into independent segments with state resets, so that multiple vectors can be generated from one model load.

The remaining limit now appears to be only the number of array items per request, which is huge.

I made a single request as large as 65k tokens, built from multiple list items of 8k each, and there was no error. Nor was any individual string of the array (list) damaged: the vectors returned for the repeated identical strings all compare as identical.

== Cosine similarity and vector comparison of all inputs ==
0:"Jet pack" <==> 1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
0.25561484806957313065 - identical: False
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 2:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
1.00000000000000000000 - identical: True
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 3:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
1.00000000000000000000 - identical: True
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 4:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
1.00000000000000000000 - identical: True
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 5:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
1.00000000000000000000 - identical: True
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 6:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
1.00000000000000000000 - identical: True
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 7:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@":
1.00000000000000000000 - identical: True
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 8:"tonal language":
0.11351222035609268013 - identical: False
1:"!@!@!@!@!@!@!@!@!@!@!@!@!@!@!@" <==> 9:"It's greased lightning!":
0.27691531937414454179 - identical: False
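
For completeness, a sketch of the comparison step that produces output like the above, assuming resp is the response from a batched embeddings call such as the one sketched earlier (the exact formatting here is mine):

import numpy as np

vectors = [np.array(d.embedding) for d in resp.data]

def cosine(a, b):
    # cosine similarity between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# compare item 1 against every other returned vector, plus an exact-identity check
for i, v in enumerate(vectors):
    if i == 1:
        continue
    print(f"1 <==> {i}: {cosine(vectors[1], v):.20f}"
          f" - identical: {np.array_equal(vectors[1], v)}")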