I'm using Assistants and the run status says 'failed', but when I print the run I don't see any errors. Here is what the run looks like: Run(id='run_gvIWhRWSWzLuJbq8eEqWVB2H', assistant_id='asst_9vVVph6XgZz6Da8JQlnJa2X4', cancelled_at=None, completed_at=None, created_at=1706120995, expires_at=1706121595, failed_at=None, file_ids=, instructions='Whats 9 + 21', last_error=None, metadata={}, model='gpt-4-1106-preview', object='thread.run', required_action=None, started_at=None, status='queued', thread_id='thread_V0W3NtXrRMIxvFhM4F3Z7sG6', tools=[ToolAssistantToolsRetrieval(type='retrieval')], usage=None).
Note the status in your output: it is 'queued', not 'failed'. A run does not finish immediately, so you have to keep polling its status until it moves from queued through in_progress to completed. Once it is completed, you can read the result from the thread's messages.
The models each have their own rate limit bins. However, it was previously reported that going well over one of the "per minute" limits (which can happen when you make many requests without specifying how big the output will be) could also shut off the other models until the overage had decayed back under the limit.
(My own rate limits are a bit too high to explore this any further without spending $5.)
Keep in mind that you need to sleep between polls in order to avoid hitting the rate limit. For Assistants, 1-2 seconds is appropriate. Use at least something like 0.1 seconds, or you will hit the limit quickly.
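Here is a rough sketch of the poll-and-sleep loop described in this thread. It is not official code: `wait_for_run`, `fetch_run` and `fake_fetch` are hypothetical names made up for illustration, the 600-second default timeout is only a guess based on the gap between `created_at` and `expires_at` in the run printed above, and the set of "still pending" statuses is an assumption. A fake fetcher stands in for the API so the snippet runs offline.

```python
import time
from types import SimpleNamespace

# Assumption: these statuses mean the run is still being processed.
PENDING_STATUSES = {"queued", "in_progress", "cancelling"}


def wait_for_run(fetch_run, poll_interval=1.0, timeout=600.0, sleep=time.sleep):
    """Poll until the run leaves the pending statuses, sleeping between polls.

    fetch_run is a hypothetical zero-argument callable that returns an
    object with a .status attribute (for example, a small wrapper around
    whatever call you use to retrieve the run).
    """
    deadline = time.monotonic() + timeout
    while True:
        run = fetch_run()
        if run.status not in PENDING_STATUSES:
            # Anything else (completed, failed, ...) means polling is done.
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {run.status!r} after {timeout} seconds")
        # Pause so the polling itself does not eat into the rate limit.
        sleep(poll_interval)


# Stand-in for the API: reports queued, then in_progress, then completed.
_fake_statuses = iter(["queued", "in_progress", "completed"])


def fake_fetch():
    return SimpleNamespace(status=next(_fake_statuses))


final_run = wait_for_run(fake_fetch, poll_interval=0.1)
print(final_run.status)
```

With the real client, assuming the v1 `openai` Python package, `fetch_run` could be something like `lambda: client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)`. After the run reports completed, `client.beta.threads.messages.list(thread_id=thread.id)` should return the assistant's reply. Always check the returned status rather than assuming success, since the loop also stops on failed or expired runs.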