I am creating an application in which I have generated a list of documents and their content with the OpenAI completion API.
I am stuck on this: when a user updates the content of a single document, I want to find the changes and apply them to all documents.
I can't call the completion API again and again within a short time limit, so I used the Batch API, but the Batch API does not respond instantly; it takes time to process. I need something I can call that updates the content of all the documents in parallel.
If you have paid under $50 to purchase API credits in the past, you are likely relegated to “tier-1”, where you have a per-minute token rate limit of 30000 tokens (or less) on gpt-4o, and low limits on other models as well. This limits even single-user applications.
The requests-per-minute limit is probably not the one you are hitting, though. Also note that a single API call cannot use more than the per-minute token limit.
Thus, if you are getting an API rate limit error after several rapid API calls, you will need to space out the API calls made by a job so as not to exceed that per-minute limit. The simple option is a sleep(60) in your loop, which slows it down without monitoring how much you are actually sending. The better option is a parallel processor that knows the token pool available in the current minute and tracks your usage, both by counting tokens before sending and by reading the token usage reported in each response.
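To make the second option concrete, here is a minimal sketch of a per-minute token budget in Python. The class name `TokenBudget`, its parameters, and the 30,000 default are my own assumptions for illustration, not part of any OpenAI library. It only does the bookkeeping: you estimate the tokens a request will use (input plus expected output), call `acquire()` before sending, and it sleeps until the sliding 60-second window has room.

```python
import time
from collections import deque


class TokenBudget:
    """Hypothetical helper: sliding-window tracker for a tokens-per-minute limit."""

    def __init__(self, tokens_per_minute=30_000, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = tokens_per_minute  # assumed tier limit; check your own account
        self.window = window            # window length in seconds
        self.clock = clock              # injectable so the logic can be tested
        self.sleep = sleep
        self.events = deque()           # (timestamp, tokens) for recent requests
        self.used = 0                   # tokens currently counted in the window

    def _expire(self, now):
        # Drop requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.used -= tokens

    def acquire(self, tokens):
        """Block until `tokens` fit in the window; return seconds waited."""
        if tokens > self.limit:
            raise ValueError("a single request cannot exceed the per-minute limit")
        waited = 0.0
        while True:
            now = self.clock()
            self._expire(now)
            if self.used + tokens <= self.limit:
                self.events.append((now, tokens))
                self.used += tokens
                return waited
            # Sleep until the oldest recorded request leaves the window.
            delay = self.window - (now - self.events[0][0])
            self.sleep(delay)
            waited += delay
```

In a job loop you would call `budget.acquire(estimated_tokens)` right before each API request. The estimate is only as good as your token counting, so it may be worth comparing it against the usage figure returned with each response and leaving some headroom below the real limit.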
You can also test how well gpt-4o-mini, which has a higher rate limit, performs the same task or the follow-up tasks. If you need more rate, and it has been over a week since your initial payment, you can make a new payment to bring the total paid over $50 (or to the next usage tier’s payment level). Note that prepaid credits expire in one year.