Help on batch / quota embedding

adidier · May 16, 2024, 4:05pm

I would love some help with quota / token usage.

I’m using embeddings with ada v2, but every time I pass a list I don’t know how many elements I can include in a single call.

Do you guys have any docs or techniques for picking the maximum number of elements from a list (strings representing user profiles) so that each call uses as many tokens as possible?

I would also love to do the same with batch embedding and the number of concurrent requests I can send.

I’ve checked the tiktoken docs, but found nothing really useful…

Ty in advance,
Adrien

_j · May 16, 2024, 7:36pm

An embeddings input list can hold up to 2047 strings.

ada-002 has a max of 8192 total tokens per API call. With the new 3-large models, I have yet to hit a limit at 100k+ tokens. You aren’t really saving anything by putting a whole bunch in one API call.

You can just add token count metadata to every chunk you are preparing, then keep adding chunks to the request until the next chunk's token count would push the accumulated total past the limit you want for one API call.
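As a rough sketch of that accumulate-until-full idea: the function name `pack_batches` and its parameters are made up for illustration, and the defaults simply reuse the 8192-token and 2047-string figures mentioned above, so adjust them to whatever limit you actually want per call. It assumes each chunk already carries its token count.

```python
from typing import Iterable, Iterator


def pack_batches(
    chunks: Iterable[tuple[str, int]],
    max_tokens: int = 8192,   # assumed per-call token budget (figure from this thread)
    max_items: int = 2047,    # assumed per-call list length (figure from this thread)
) -> Iterator[list[str]]:
    """Greedily group (text, token_count) pairs into per-call batches."""
    batch: list[str] = []
    total = 0
    for text, n_tokens in chunks:
        if n_tokens > max_tokens:
            # A single chunk that is too big can never fit; split it upstream.
            raise ValueError(f"chunk of {n_tokens} tokens exceeds budget {max_tokens}")
        # Close the current batch if adding this chunk would break either limit.
        if batch and (total + n_tokens > max_tokens or len(batch) >= max_items):
            yield batch
            batch, total = [], 0
        batch.append(text)
        total += n_tokens
    if batch:
        yield batch
```

Each yielded list can then be sent as the `input` of one embeddings request. This keeps the original order of the chunks, so it will not always produce the tightest possible packing.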

tiktoken indeed can count. You just encode chunks to a token list, and then measure the list length. Two lines of code.
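For reference, the counting step might look something like this. The helper names `count_tokens` and `load_encoding` are just for illustration; `cl100k_base` is, as far as I know, the encoding used by ada-002 and the 3-series embedding models, but double-check that for your model.

```python
def count_tokens(text: str, encoding) -> int:
    """Encode the text to a token list and measure the list length."""
    return len(encoding.encode(text))


def load_encoding(name: str = "cl100k_base"):
    """Load a tiktoken encoding (assumes `pip install tiktoken`)."""
    import tiktoken  # third-party; imported lazily so the helper above stays standalone

    return tiktoken.get_encoding(name)


# Usage (needs tiktoken installed):
#   enc = load_encoding()
#   n = count_tokens("some user profile text", enc)
```

Storing `n` next to each chunk gives you the token count metadata to sum up when filling a request.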

The Batch API can make any API call you can normally complete. It just has its own limits on the number of calls per file, the number of tokens per file, and the file size. If you can wait up to 24 hours, you get a 50% discount.
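A minimal sketch of preparing the batch input file, assuming the JSONL request format as I understand it (one JSON object per line with `custom_id`, `method`, `url`, and `body`). The `build_batch_lines` name and the `embed-{i}` ID scheme are hypothetical, and the model name is only an example, so verify the fields against the current Batch API docs before relying on this.

```python
import json


def build_batch_lines(batches: list[list[str]], model: str = "text-embedding-3-large") -> list[str]:
    """Turn lists of input strings into one JSONL line per embeddings request."""
    lines = []
    for i, inputs in enumerate(batches):
        request = {
            "custom_id": f"embed-{i}",   # hypothetical ID scheme, used to match results to requests
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": inputs},
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines


# Usage: write the lines to a .jsonl file, then upload it for batch processing.
#   with open("embeddings_batch.jsonl", "w", encoding="utf-8") as f:
#       f.write("\n".join(build_batch_lines([["profile one", "profile two"]])) + "\n")
```

Every line is still an ordinary embeddings request, so the per-call token and list-size limits apply to each line in addition to the per-file limits.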
