We are building a rather complicated process for running batch API requests, and we need multiple iterations to tweak it. Waiting a day between tweaks would make this task take a very long time to finish. Is there a way to submit very small batch requests with only a few items and get a quick response for dev and testing?
The jobs are scheduled by OpenAI for whenever it is cheap for OpenAI (and, by extension, for you).
The size seems to have no bearing on the start time, and there is no “me first, please” mechanism.
Completion within 24 hours is not a guarantee, nor is there a guarantee that you get anything more than a cancellation after 24 hours. A batch might be done in ten minutes or in ten hours, depending on conditions when you submit.
The unpredictability can make testing against the batch API a bit of a pain.
In lmwrapper we have a few unit tests that depend on it. We submit small batches there and tend to get a response within a few minutes. However, we have to set a timeout of ~30 minutes and accept some risk of flakiness. Testing with less-used models (e.g., gpt-3.5-turbo) seems like it could help batch latency. In other parts of our code, and in downstream libraries, we generally mock out as much as we can to avoid actual LM calls.
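As a rough sketch of that mocking idea (this is not lmwrapper's actual test code; the summarize function and the .predict()/.completion_text interface are just illustrative assumptions), you can stub out the prediction call so unit tests never touch the network or the batch queue:

from unittest.mock import MagicMock

# Hypothetical application code under test: it only needs an object
# with a .predict(prompt) method returning something with .completion_text.
def summarize(lm, text: str) -> str:
    return lm.predict(f"Summarize: {text}").completion_text

def test_summarize_without_real_api_calls():
    # Fake LM: canned completion, no network call, no waiting on a batch.
    fake_lm = MagicMock()
    fake_lm.predict.return_value.completion_text = "a short summary"

    assert summarize(fake_lm, "some long document") == "a short summary"
    fake_lm.predict.assert_called_once()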
But yes, for end-to-end testing submitting very small batches can be a solution.
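If you want to try that with the Batch API directly (rather than through a wrapper), a minimal end-to-end sketch using the official openai Python SDK might look like the following; the two toy prompts, the 30-minute polling deadline, and the file name are arbitrary choices for illustration:

import json
import time
from openai import OpenAI

client = OpenAI()

# A tiny two-request batch, just for a quick dev/test round trip.
requests = [
    {
        "custom_id": f"test-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": f"Say the number {i}."}],
        },
    }
    for i in range(2)
]
with open("tiny_batch.jsonl", "w") as f:
    f.writelines(json.dumps(r) + "\n" for r in requests)

batch_file = client.files.create(file=open("tiny_batch.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",  # currently the only accepted value; small batches often finish much sooner
)

# Poll with a hard deadline so a dev run never hangs for a full day.
deadline = time.time() + 30 * 60
while time.time() < deadline:
    batch = client.batches.retrieve(batch.id)
    if batch.status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(30)

if batch.status == "completed":
    print(client.files.content(batch.output_file_id).text)
else:
    print(f"Batch not done in time (status={batch.status}); consider cancelling and retrying.")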
Alternatively, depending on what you are building, you can use a scheme similar to lmwrapper's, where you abstract away whether you are hitting the normal API or the batch API during testing/development. For example:
from lmwrapper.openai_wrapper import get_open_ai_lm, OpenAiModelNames
from lmwrapper.batch_config import CompletionWindow
lm = get_open_ai_lm(OpenAiModelNames.gpt_3_5_turbo)
prompts = my_custom_function_for_building_my_prompts()
is_testing = am_i_currently_dev_testing()
predictions = lm.predict_many(
    prompts,
    completion_window=(
        # Which endpoint we use is abstracted away
        CompletionWindow.BATCH_ANY if not is_testing
        else CompletionWindow.ASAP
    )
)
for pred in predictions:
    print(pred.completion_text)
    # whatever extra processing...
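With this pattern, test/dev runs go through the normal synchronous endpoint (ASAP) and return right away, while production runs switch to the batch endpoint (BATCH_ANY), which OpenAI currently offers at roughly a 50% discount, and the surrounding code does not change at all.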
If you send gpt-4o-mini small batches (~500-1000 calls) during off-peak hours (after 8pm PST), I typically see return times under 30 minutes. I've never had one take more than maybe an hour to turn around.
I’ve sent probably 15-20 batch jobs so far, ymmv, but they don’t actually take that long to complete. Also, a test batch of that size with 4o-mini costs like 11 cents, so just start trying things out!