I am getting this same error in our chat completions based agent.
The server had an error while processing your request. Sorry about that!
Few things about our case:
- We have a fairly large system prompt
stream
is set to true- The only other message is a
user
role, ‘hello how are you’, sent as formatwav
parallel_tool_calls
is set to falseaudio
is set to alloy/pcm16modalities
is text, audio- We have 6 functions defined in our tools array
What I’ve tried while debugging:
- Our non-multimodal agent works fine, with the same tool definitions, so it isn’t a parsing problem with the tools. Non-multimodal, meaning, prior to the recent updates to chat completions, I hooked up a pipeline that sent the audio to get transcribed, then to the old chat completions, then took the response to TTS, and played back the audio. That pipeline works fine, with all the tool definitions.
- If I remove three of the tools, the call completes correctly
- Seeing this, I thought it might be a total content length issue, but if I remove our system prompt, leaving the tool definitions, it still fails (at a smaller content length than when I remove three of the tools). This leads me to believe it isn’t a total content length issue, but rather something specific with the tools.
- I tried removing various combinations of tools, and it doesn’t seem related to a specific tool, but rather the total size of tool definitions.
Perhaps when used in a multimodal fashion, the total amount of space you can use for tool definitions is smaller? We did run into a few issues during development of the non-multimodal agent when we made the description field in a tool too long, for example. Maybe this error is some variation of that?