Best Practices for Reliable Embeddings Pipeline

I am building a system that needs to embed large volumes of data, and it needs to be robust to failure. Making concurrent API calls to OpenAI or Hugging Face is neither fast enough nor reliable enough for the volume of data we have. We have considered building an internal queue system, but we are not sure we want to maintain one. Has anyone used a service that handles this and can recommend it?

The OpenAI Embeddings API accepts batches of inputs in a single request. I can’t find a reference to the upper limit per batch, but I regularly use it for batches of 30+.
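
A minimal sketch of what that looks like with the current `openai` Python SDK; the model name and batch size here are illustrative, not recommendations:

```python
# Batched embeddings request: one API call, many inputs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

texts = [f"document {i}" for i in range(32)]  # placeholder batch

response = client.embeddings.create(
    model="text-embedding-3-small",  # assumed model; swap in your own
    input=texts,
)

# One embedding per input, returned in the same order as the request.
vectors = [item.embedding for item in response.data]
```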

And it doesn’t take much to wrap the request in a try/catch block and retry on failure. If it fails more than some retry limit, stash the request body in a failed-requests file for later inspection.
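
Something like the sketch below. The retry limit, backoff schedule, and dead-letter file path are all assumptions you would tune for your own pipeline:

```python
# Retry-then-stash pattern: retry a failed batch a few times with backoff,
# then append the request body to a failure file and move on.
import json
import time

from openai import OpenAI

client = OpenAI()
MAX_RETRIES = 3                        # assumed limit
FAILED_PATH = "failed_batches.jsonl"   # hypothetical dead-letter file


def embed_batch(texts: list[str]) -> list[list[float]] | None:
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            response = client.embeddings.create(
                model="text-embedding-3-small",  # assumed model
                input=texts,
            )
            return [item.embedding for item in response.data]
        except Exception as exc:
            if attempt == MAX_RETRIES:
                # Out of retries: record the request body for inspection.
                with open(FAILED_PATH, "a") as f:
                    f.write(json.dumps({"input": texts, "error": str(exc)}) + "\n")
                return None
            time.sleep(2 ** attempt)  # simple exponential backoff
    return None
```

The failure file gives you a replayable record: once the outage or bad input is diagnosed, you can re-feed those batches through the same function.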

Not sure what type of service would provide a simpler workflow than that.
