Speeding up a large task over the API

I have 25,000 objects that need labeling, and I've taken a (hopefully) representative sample of 1,000 and manually labelled them myself. It's not a super complex task, but there are a lot of edge cases. I don't expect the model to perform perfectly, but it beats labeling all of them by hand.

So far I've only done a subset of 1,200 products, which took 90 minutes (with a 1-second sleep between requests), and a few requests failed due to server-side issues.

Is there some way to do this more quickly while staying robust?

Here is an OpenAI example of batching parallel requests to the API:
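A minimal sketch of that parallel-with-retry approach, using only the standard library. `fake_label` is a placeholder for the real API call (it fails randomly to simulate the server-side errors mentioned above), and the product names and backoff delays are made up for the demo:

```python
import concurrent.futures
import random
import time

def fake_label(item):
    """Placeholder for the real API call; fails randomly to simulate
    transient server-side errors."""
    if random.random() < 0.2:
        raise RuntimeError("simulated server-side error")
    return f"label-for-{item}"

def call_with_retry(item, max_retries=6):
    """Retry transient failures with exponential backoff plus jitter.
    The short base delay keeps this demo fast; use ~1s or more against
    the real API."""
    for attempt in range(max_retries):
        try:
            return fake_label(item)
        except RuntimeError:
            if attempt == max_retries - 1:
                raise
            time.sleep(min(2 ** attempt, 8) * 0.1 + random.random() * 0.1)

def label_all(items, workers=8):
    """Label items concurrently; pool.map preserves input order, so
    results line up with the input list."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(call_with_retry, items))

labels = label_all([f"product-{i}" for i in range(20)])
print(labels[0])  # label-for-product-0
```

With a thread pool of 8 workers and no fixed 1-second sleep per item, throughput is bounded by your rate limit rather than by a serial loop; the retry wrapper absorbs the occasional server-side failure instead of dropping the item.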

The fastest model for token production, and the most reliable for data output without accidental chat once you've adapted your prompt for it, is gpt-3.5-turbo-instruct on the completions endpoint. You can give it direct, short instruction statements.
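For example, a request to the completions endpoint might look like the sketch below. The category list and product are made-up illustrations, not from the original post; the actual API call (shown commented out) requires an API key and the `openai` client:

```python
import json

def build_labeling_prompt(product):
    """A short, direct instruction prompt suited to the instruct model.
    The categories here are hypothetical examples."""
    return (
        "Classify the product into exactly one category: "
        "electronics, clothing, home, other.\n"
        f"Product: {product}\n"
        "Category:"
    )

payload = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": build_labeling_prompt("USB-C wall charger, 30W"),
    "max_tokens": 5,   # labels are short, so cap the output
    "temperature": 0,  # deterministic labels
}

# The actual request, e.g. with the openai Python client:
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.completions.create(**payload)
#   label = resp.choices[0].text.strip()
print(json.dumps(payload, indent=2))
```

Ending the prompt with `Category:` nudges the completion model to emit just the label, which also keeps the response cheap and easy to parse.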
