Help on batch / quota embedding

I would love some help with quota / token usage.

I’m using embeddings with ada v2, but every time I pass a list I don’t know how many elements I can send in a single call.

Do you have any docs or techniques for choosing the maximum number of elements in a list (strings representing user profiles) so as to maximize the number of tokens per call?

I would also love to do the same with batch embedding and the number of concurrent requests I can send.

I’ve checked the tiktoken docs, but found nothing really useful…

Ty in advance,
Adrien

An embeddings input list can contain up to 2047 strings.

ada-002 has a max of 8192 total tokens per API call. With the new 3-large models, I have yet to hit a limit at 100k+ tokens. You aren’t really saving anything by putting a whole bunch in one API call.

You can add token-count metadata to every chunk you prepare, then keep adding chunks to a request until the next chunk’s token count would push the running total past the limit you want for a single API call.
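As a rough sketch of that accumulate-until-full idea: the function below assumes each chunk already comes as a `(text, token_count)` pair. The name `pack_batches` and its parameters are made up for illustration, and the default limits are just the figures quoted in this reply, so adjust them to whatever your model and account actually allow.

```python
from typing import List, Tuple


def pack_batches(
    chunks: List[Tuple[str, int]],
    max_tokens: int = 8192,  # assumed per-call token budget (figure from this thread)
    max_items: int = 2047,   # assumed per-call list length (figure from this thread)
) -> List[List[str]]:
    """Greedily group chunks so each batch stays within both limits."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_tokens = 0

    for text, n_tokens in chunks:
        # A single chunk larger than the budget can never fit in any batch.
        if n_tokens > max_tokens:
            raise ValueError(f"chunk of {n_tokens} tokens exceeds max_tokens={max_tokens}")

        # Start a new batch if adding this chunk would break either limit.
        too_many_tokens = current_tokens + n_tokens > max_tokens
        too_many_items = len(current) >= max_items
        if current and (too_many_tokens or too_many_items):
            batches.append(current)
            current, current_tokens = [], 0

        current.append(text)
        current_tokens += n_tokens

    # Flush whatever is left over.
    if current:
        batches.append(current)
    return batches
```

Each inner list can then be sent as the input of one embeddings call. This keeps chunks in their original order; it does not try to find an optimal packing.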

tiktoken can indeed count tokens: encode a chunk to a token list, then measure the list’s length. Two lines of code.
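For example, something along these lines should work, assuming the `tiktoken` package is installed and that `cl100k_base` is the right encoding for your model (it is the one used by ada-002). The helper names `make_token_counter` and `annotate` are only illustrative, not part of any library.

```python
from typing import Callable, List, Tuple


def make_token_counter(encoding_name: str = "cl100k_base") -> Callable[[str], int]:
    """Return a function that counts tokens using the named tiktoken encoding."""
    import tiktoken  # imported here so the rest of the module loads without it

    enc = tiktoken.get_encoding(encoding_name)
    # The "two lines": encode to a token list, then take its length.
    return lambda text: len(enc.encode(text))


def annotate(chunks: List[str], count_tokens: Callable[[str], int]) -> List[Tuple[str, int]]:
    """Pair every chunk with its token count so it can be used as metadata."""
    return [(chunk, count_tokens(chunk)) for chunk in chunks]


# Usage (needs tiktoken; the first call may download the encoding file):
#   count = make_token_counter()
#   profiles = annotate(["first user profile text", "second user profile text"], count)
```

The counter is passed in as an argument so you can swap the encoding, or use a cheap approximation, without touching the rest of your pipeline.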

The Batch API can run any API call you can normally make. It just has its own limits on the number of calls per file, the number of tokens per file, and the file size. If you can wait 24 hours, you get a 50% discount.