Constantly hitting server errors in Agent Builder

I have a multi-agent workflow in Agent Builder. It's not very complicated: a series of LLM calls based on some if/then logic, generate content → generate code → render code using a client tools function call.

I constantly get “server errors” when running the workflow, often in the first content generation node, which uses GPT-5 with medium reasoning. And it’s not just once or twice: seemingly every other time I try to go through the workflow it fails, in both the playground and production. It’s especially frustrating in production because it fails silently, so there’s no status the app can act on to retry.

{
  "code": "server_error"
}
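
For anyone hitting the same thing, here is a minimal sketch of the retry the app could attempt if the failure were surfaced as the payload above. It is not an Agent Builder or ChatKit API: `run_workflow` is a hypothetical callable standing in for however you invoke the workflow, and `WorkflowServerError`, `max_attempts`, and `base_delay` are names made up for illustration. It also assumes the error is transient, which I can't confirm.

```python
import random
import time


class WorkflowServerError(Exception):
    """Hypothetical error raised when every attempt returns server_error."""


def run_with_retry(run_workflow, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call run_workflow(), retrying with exponential backoff on server_error.

    run_workflow is a hypothetical zero-argument callable that returns the
    workflow result, or a dict like {"code": "server_error"} on failure.
    """
    for attempt in range(1, max_attempts + 1):
        result = run_workflow()
        # Anything other than the server_error payload counts as success.
        if not (isinstance(result, dict) and result.get("code") == "server_error"):
            return result
        if attempt == max_attempts:
            raise WorkflowServerError(f"server_error after {max_attempts} attempts")
        # Exponential backoff (1x, 2x, 4x, ...) plus a little jitter.
        sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
```

This only helps when the failure actually reaches the app; when it fails silently, a timeout around the call would be needed as well.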

Does anyone know what I might be doing wrong, or is this a common issue others have also faced? If so, how have you (if you have) been able to resolve it? Any guidance would be greatly appreciated. The promise of ChatKit/Agent Builder seems awesome, but for now, even just prototyping and getting a useful workflow functioning correctly is quite the uphill battle.

Really frustrating. I’ve got the whole flow working in Agent Builder, where it goes through the entire flow (albeit sometimes failing with the same “server_error”), but in my ChatKit implementation it’s failing on the first step. The trace error says “server error”, with no further details. Here’s the trace ID for reference: trace_8f36540d0dcc4cf6a84e1a167af03c1b

@OAI team, any guidance? Right now this is unusable. I’m actively porting to AutoGen.