I have an app that processed a batch of 4000 requests using gpt-4o. Each request is about 2000 tokens. As items are added, whenever the current batch reaches a certain size, it is submitted to the server and a new batch is started, until all items have been processed. In this case twelve batches were created automatically for the 4000 requests, so each batch held roughly 333 requests, or about 667,000 tokens. Eleven of the twelve succeeded; the one that failed was the third to be processed.
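For context, here is a simplified sketch of my batching logic (the variable names, `BATCH_SIZE`, and the request bodies are illustrative, not my exact production code):

```python
import json
import tempfile

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
BATCH_SIZE = 334   # ~4000 requests split across 12 batches

# Illustrative request payloads in the Batch API's JSONL input format.
all_items = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-2024-08-06",
            "messages": [{"role": "user", "content": f"Request {i}"}],
        },
    }
    for i in range(4000)
]

def submit_batch(requests: list[dict]) -> str:
    """Write one chunk of requests to a JSONL file and enqueue it as a batch."""
    with tempfile.NamedTemporaryFile("w+b", suffix=".jsonl") as f:
        for req in requests:
            f.write((json.dumps(req) + "\n").encode("utf-8"))
        f.seek(0)
        input_file = client.files.create(file=f, purpose="batch")
    batch = client.batches.create(
        input_file_id=input_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    return batch.id

# Flush a batch to the server whenever it reaches BATCH_SIZE.
pending, batch_ids = [], []
for item in all_items:
    pending.append(item)
    if len(pending) >= BATCH_SIZE:
        batch_ids.append(submit_batch(pending))
        pending = []
if pending:
    batch_ids.append(submit_batch(pending))
```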
Also, I am a Tier 4 organization with plenty of capacity, so this error message does not make sense to me:
Enqueued token limit reached for gpt-4o-2024-08-06 in organization org-mSPpvLbxkAGWzFFULRACeFkQ. Limit: 1,000,000 enqueued tokens. Please try again once some in_progress batches have been completed.
If I had reached my limit, why did it continue to successfully process the other batches?

This seems like a bug. Can anybody kindly lend some insight?
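If this turns out to be expected behavior rather than a bug, I assume the workaround is to wait for earlier batches to clear before enqueuing the next one. Something like this untested sketch, where the polling interval and the `max_in_progress` threshold are guesses on my part:

```python
import time

from openai import OpenAI

client = OpenAI()
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_capacity(batch_ids: list[str], max_in_progress: int = 1) -> None:
    """Block until few enough previously submitted batches are still in flight.

    Assumption on my part: the 1,000,000 enqueued-token limit counts every
    batch that has not yet reached a terminal status.
    """
    while True:
        in_flight = [
            bid for bid in batch_ids
            if client.batches.retrieve(bid).status not in TERMINAL_STATUSES
        ]
        if len(in_flight) <= max_in_progress:
            return
        time.sleep(30)  # back off before polling again
```

But I'd rather understand why only the third batch was rejected before bolting on throttling like this.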