I'm doing around 7 requests per minute, each request having roughly 750 input tokens and 50 output tokens.
My software seems to get stuck relatively quickly, though, and I don't know if the issue is my software or OpenAI. I'd just love to get an answer here so I know where to start digging.
I just talked with my programmer, who responded:
"You need to make these calls asynchronously for them to be processed concurrently." We are already doing this, and we couldn't see any errors related to calling ChatGPT.
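For context, our calling pattern looks roughly like this (a simplified sketch in Python with the async client; the model name and prompts are placeholders, not our actual values):

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def complete(text: str) -> str:
    # One chat completions request; roughly 750 tokens in, 50 out
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": text}],
        max_tokens=50,
    )
    return response.choices[0].message.content

async def main() -> None:
    prompts = [f"Process record {i}" for i in range(7)]
    # gather() fires the requests concurrently instead of one by one
    results = await asyncio.gather(*(complete(p) for p in prompts))
    for result in results:
        print(result)

asyncio.run(main())
```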
If this is a non-real-time process on an existing data set, you might consider using OpenAI's own batch processing: you send a file of chat completions requests in the API's JSONL format for a 50% discount, with a 24-hour turnaround for a file of responses (although recently, for -mini models, that 24 hours often ends in a "wasn't run" instead).
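A minimal sketch of that flow with the Python SDK (assuming openai v1.x; the file name batchinput.jsonl and its contents are illustrative):

```python
from openai import OpenAI

client = OpenAI()

# Upload a JSONL file where each line is one chat completions request, e.g.
# {"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions",
#  "body": {"model": "gpt-4o-mini", "messages": [...], "max_tokens": 50}}
batch_file = client.files.create(
    file=open("batchinput.jsonl", "rb"),
    purpose="batch",
)

# Create the batch job with the 24-hour completion window
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Later: check the status; once completed, download the results file
status = client.batches.retrieve(batch.id)
if status.status == "completed":
    results = client.files.content(status.output_file_id).text
    print(results)
```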
Without even knowing what language you are using, it is hard to guess at any faults in techniques (although fault one is a programmer calling the API “ChatGPT”).
Absolutely! Please tell me. I always manage to make the system work for a few minutes until, all of a sudden, it stops for one hour and then works again. Is there any limitation on OpenAI's side that would explain a consistent one-hour pause between it stopping and starting to work again?
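In case it helps with diagnosing this, here is a small sketch of how I could log the rate-limit headers the API returns on each response (assuming the Python SDK's with_raw_response helper; the model name is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# with_raw_response exposes the HTTP headers alongside the parsed body,
# so we can see how close the account is to its rate limits
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
for header in (
    "x-ratelimit-remaining-requests",
    "x-ratelimit-remaining-tokens",
    "x-ratelimit-reset-requests",
    "x-ratelimit-reset-tokens",
):
    print(header, raw.headers.get(header))
```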