We have been using the Assistants API since its launch and consume several hundred thousand tokens an hour (tier 4). Since yesterday’s downtime, we are still seeing elevated issues in the following areas:
Imported PDF files cannot be processed in roughly 1 in 10 cases
Responses are highly unstable, although they had been working fine for months
We have run IDs and thread IDs for reference, but we’re unsure who can help. Thank you in advance!
EDIT: It seems like the issue is now tracked as an incident!
For the second day, I’ve been observing a similar issue where the run returns a status of in_progress for a few seconds, and then responses stop without any error being returned. The result may come back after a delay of 10 minutes, or sometimes not at all. Previously, everything was working reliably; the problem seems to be on OpenAI’s side.
We are seeing the same instability in the answers. Sometimes the assistant manages to analyze the file, and many other times it says that it can’t access the file, even though the file is actually there.
Additionally, the assistant fails to follow the instructions, generating answers that completely miss the request.
My Assistants API sometimes answers and sometimes doesn’t.
Right now I’ve been trying in the Playground and am barely ever getting an answer. Most of the time it just freezes after I send the prompt.
For me, it’s actually pretty reliable, aside from when the API has elevated errors.
The flakiest part is polling, especially when queries hang; however, these issues can be easily mitigated by forcing timeouts and effectively managing your API queries.
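To make "forcing timeouts" concrete, here is a minimal sketch of a polling loop with an overall deadline. It is not tied to any SDK: `fetch_status` is a hypothetical callable you would wrap around your own run-retrieval call, and the set of terminal status names is an assumption you should check against the statuses your runs actually report.

```python
import time

# Assumed terminal states; adjust to the statuses your runs actually return.
TERMINAL_STATES = {"completed", "failed", "cancelled", "expired"}


def poll_until_done(fetch_status, deadline_s=60.0, interval_s=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal state or the deadline passes.

    fetch_status is a hypothetical zero-argument callable returning a status string.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        # Give up instead of polling a hung run forever.
        if clock() - start >= deadline_s:
            raise TimeoutError(f"run still {status!r} after {deadline_s}s")
        sleep(interval_s)
```

The caller can then decide what to do on `TimeoutError`, for example cancel the run and start a new one, rather than waiting indefinitely on a run stuck in in_progress.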
I will say, while this feature was originally intended to be more of a “hands-off” approach, it has taken far more work for me to get it working right, and I’ve effectively created my own library to manage the entire assistant lifecycle.
However, that work is paying off as I start getting pretty consistent results.
Most of my current issues will be resolved with streaming.
Thank you for the advice, Danny. I implemented proper timeout handling with retries for querying the run status, and it resolved the issue. Now, if the API endpoint hangs and doesn’t return the next state in the current async call, I initiate a retry to get the actual status, and everything continues smoothly.
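For anyone who wants to try the same thing, a rough sketch of the retry idea might look like the following. This is not my exact code: `fetch_status` is a hypothetical callable that takes a per-request timeout and is assumed to raise the built-in `TimeoutError` when the request hangs (with a real HTTP client you would likely need to translate its own timeout exception), and the attempt count and backoff values are arbitrary.

```python
import time


def fetch_with_retry(fetch_status, timeout_s=10.0, max_attempts=3,
                     backoff_s=0.5, sleep=time.sleep):
    """Retry a single status request when it times out.

    fetch_status(timeout_s) is a hypothetical callable that returns the run
    status or raises TimeoutError if the request does not finish in time.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch_status(timeout_s)
        except TimeoutError as exc:
            last_error = exc
            # Back off exponentially before the next attempt, but not after the last one.
            if attempt < max_attempts - 1:
                sleep(backoff_s * 2 ** attempt)
    raise TimeoutError(f"no status after {max_attempts} attempts") from last_error
```

The point is that a hung request is abandoned after `timeout_s` and a fresh request is made, so the caller gets the actual current status instead of waiting on a connection that never answers.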