Problem
We’re using the OpenAI Assistants API (Node.js) with a polling loop to wait for run completion. In production, we noticed that some runs — particularly in longer sessions with many back-and-forth messages — were taking 60–90 seconds to complete.
Our backend was configured with a 60-second polling timeout, which caused a run_timeout error and returned a 500 to the user even though the run eventually completed successfully on OpenAI’s side.
What we tried
We increased our polling timeout from 60s to 150s using an environment variable:
// Fall back to 150 s when the env var is unset or not a number
const TIMEOUT_MS = parseInt(process.env.OPENAI_RUN_TIMEOUT_MS, 10) || 150000;
while (!TERMINAL.includes(runStatus.status)) {
  if (Date.now() - startTime >= TIMEOUT_MS) throw new Error('run_timeout');
  await new Promise(r => setTimeout(r, 1000)); // wait 1 s between polls
  runStatus = await openai.beta.threads.runs.retrieve(threadId, run.id);
}
This solved the immediate issue, but polling every second for up to 150 seconds does not seem ideal.
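For comparison, here is a minimal sketch of the same loop with capped exponential backoff, so that a long run triggers fewer retrieve calls. The helper name pollUntilTerminal, its option names, and the default delays are hypothetical. The status list is also an assumption, since the TERMINAL array is not shown above. The retrieve argument stands in for a call such as () => openai.beta.threads.runs.retrieve(threadId, run.id).

```javascript
// Assumed terminal statuses; adjust to match your own TERMINAL list.
// Note: 'requires_action' is not terminal, so runs that use tool calls need extra handling.
const TERMINAL_STATUSES = ['completed', 'failed', 'cancelled', 'expired', 'incomplete'];

// Poll `retrieve` until the run reaches a terminal status or the timeout elapses.
// The delay between polls grows by `factor` each time, capped at `maxDelayMs`.
async function pollUntilTerminal(retrieve, opts = {}) {
  const { timeoutMs = 150000, initialDelayMs = 500, maxDelayMs = 5000, factor = 1.5 } = opts;
  const startTime = Date.now();
  let delayMs = initialDelayMs;
  let result = await retrieve();

  while (!TERMINAL_STATUSES.includes(result.status)) {
    const remainingMs = timeoutMs - (Date.now() - startTime);
    if (remainingMs <= 0) throw new Error('run_timeout');

    // Never sleep past the deadline.
    await new Promise(r => setTimeout(r, Math.min(delayMs, remainingMs)));
    delayMs = Math.min(delayMs * factor, maxDelayMs);
    result = await retrieve();
  }
  return result;
}
```

With these defaults, a 90-second run would be polled roughly every 5 seconds once the delay reaches its cap, rather than 90 times.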
Question
- Is runs.stream() a better long-term approach to avoid timeout issues entirely?
- What causes runs to take so much longer in sessions with more messages?
- Any production recommendations for handling slow runs gracefully?
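To make the first question concrete, this is roughly what the streaming variant might look like. It is a sketch, not something we have verified in production. It assumes the Node SDK's openai.beta.threads.runs.stream() helper, its textDelta event, and finalRun(). The function name runWithStream and the assistantId and onText parameters are hypothetical.

```javascript
// Sketch only: start a run and wait for it through the streaming helper instead of polling.
// `openai` is an already-initialized client; `onText` receives text chunks as they arrive.
async function runWithStream(openai, threadId, assistantId, onText = () => {}) {
  const stream = openai.beta.threads.runs.stream(threadId, { assistant_id: assistantId });

  // Forward incremental text so the caller can relay partial output to the user.
  stream.on('textDelta', (delta) => {
    if (delta && delta.value) onText(delta.value);
  });

  // Resolves with the final run object once the stream ends.
  return stream.finalRun();
}
```

The HTTP request to our own backend would still need a timeout or a streamed response of its own, so this changes how we wait rather than how long the run takes.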
Thanks