Hi Raymond, no, I haven’t used this feature myself.
But I would just count the characters and limit it that way. I forget the exact character-to-token ratio, but for Davinci I make sure I stay under 10k characters. I think the ada-002 token limit is about 4x larger, though, so maybe cap it at 40k characters per call? You could probably go larger, too.
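Something along these lines would do the character cap (an untested sketch; the 40k figure is just my rough 4x guess above, and `cap_batch` is an illustrative name, not anything from the API):

```python
MAX_CHARS = 40_000  # rough guess: ~4x my 10k Davinci cap, not a tested limit

def cap_batch(texts, max_chars=MAX_CHARS):
    """Yield lists of texts whose combined length stays under max_chars."""
    batch, size = [], 0
    for text in texts:
        # start a new batch when the next text would blow the budget
        if batch and size + len(text) > max_chars:
            yield batch
            batch, size = [], 0
        batch.append(text)
        size += len(text)
    if batch:
        yield batch  # flush the final partial batch
```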
Personally, I would just let it run at the slower rate instead of rewriting your code. But if you did rewrite it, you would need a nested loop: walk over the whole thing, gather up rows until the next one would push you past the 8k token limit, send that batch off to the API, get it back, stuff the results into the frame, and keep going with the next chunk. Not trivial.
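Here is roughly what I mean, assuming the pre-1.0 `openai` Python library (with `openai.api_key` already set), `tiktoken` for counting, and a pandas frame with a `text` column. All the names here are illustrative, not your code:

```python
import openai
import tiktoken
import pandas as pd

enc = tiktoken.encoding_for_model("text-embedding-ada-002")
TOKEN_LIMIT = 8_000  # stay a little under the 8,191-token maximum

def embed_frame(df: pd.DataFrame) -> pd.DataFrame:
    embeddings = []
    batch, batch_tokens = [], 0
    for text in df["text"]:
        n = len(enc.encode(text))
        # gather rows until the next one would exceed the token limit,
        # then send the batch off and start a fresh one
        if batch and batch_tokens + n > TOKEN_LIMIT:
            resp = openai.Embedding.create(
                model="text-embedding-ada-002", input=batch)
            embeddings.extend(d["embedding"] for d in resp["data"])
            batch, batch_tokens = [], 0
        batch.append(text)
        batch_tokens += n
    if batch:  # flush the final partial batch
        resp = openai.Embedding.create(
            model="text-embedding-ada-002", input=batch)
        embeddings.extend(d["embedding"] for d in resp["data"])
    # results come back in input order, so they line up with the rows
    df["embedding"] = embeddings
    return df
```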
Normally I just work with a database, so I can query whether I’ve already embedded something before sending it off to the API. Keeping the code simpler but waiting longer is acceptable in my situation.
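The dedupe check can be as simple as this (a sketch assuming a `pyodbc` cursor into the SQL database; the `embeddings` table and `text_hash` column are hypothetical names):

```python
import hashlib

def needs_embedding(cursor, text: str) -> bool:
    """True if this text hasn't been embedded yet (keyed by content hash)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    # pyodbc uses ? placeholders for parameters
    cursor.execute("SELECT 1 FROM embeddings WHERE text_hash = ?", (digest,))
    return cursor.fetchone() is None
```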
I have a background process that chugs away whenever it sees a new series of text to encode. It wakes up every minute; if it finds records, it keeps processing them for 50 seconds and then restarts on the minute. (It was simple to code this way.)
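The loop itself is nothing fancy. A sketch of the cadence, with `fetch_pending` and `process_record` as hypothetical stand-ins for the real query and the embed-and-store step:

```python
import time

WORK_WINDOW = 50  # seconds of processing per one-minute cycle

def fetch_pending():
    # hypothetical: query for rows that don't have embeddings yet
    return []

def process_record(record):
    # hypothetical: embed the text and store the vector
    pass

def run_worker():
    while True:
        cycle_start = time.time()
        deadline = cycle_start + WORK_WINDOW
        for record in fetch_pending():
            if time.time() >= deadline:
                break  # stop at the 50-second mark
            process_record(record)
        # sleep out the remainder so each cycle restarts on the minute
        time.sleep(max(0, 60 - (time.time() - cycle_start)))
```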
I did 7,000 rows in about 15 to 20 minutes (roughly 6 to 8 rows per second), so it’s not a problem at this stage. My text blocks are 500 tokens or less.
My embeddings are stored in a proprietary vector database and linked to a supporting MS-SQL database for the original text and related data.
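The linkage between the two is just a shared record id. A toy version, with a dict standing in for the vector store and a hypothetical `documents` table on the SQL side:

```python
import numpy as np

vector_store = {}  # record_id -> embedding; stand-in for the vector DB

def store_embedding(record_id, embedding):
    vector_store[record_id] = np.asarray(embedding, dtype=np.float32)

def lookup_text(cursor, record_id):
    # after a nearest-neighbor search returns record_id, fetch the
    # original text from SQL (hypothetical table/column names)
    cursor.execute("SELECT text FROM documents WHERE id = ?", (record_id,))
    row = cursor.fetchone()
    return row[0] if row else None
```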