gpt-3.5-turbo BROKEN: context-loading/response issue - emitting calls to functions never specified

ISSUE

gpt-3.5-turbo, all model versions

  • The Playground is sending an empty tools list instead of no parameter at all
  • With an empty tools list, the AI model is unexpectedly forced to write function calls
  • The model variant that “knows” functions is employed, along with the tools and multi-tool wrapper
  • The symptom extends to models released before parallel tool calls.
  • Reproducible by direct API call; OK with no tools parameter.
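A minimal sketch of the two request shapes described above (the model name and message are placeholders, not the exact Playground request): the failing case carries `"tools": []`, while the working case omits the `tools` key entirely.

```python
import json

# Shared request skeleton (illustrative values only).
base = {
    "model": "gpt-3.5-turbo-0125",
    "messages": [{"role": "user", "content": "Say hello."}],
}

# What the Playground reportedly sends: an explicit empty tools list.
broken = dict(base, tools=[])

# The working request simply omits the "tools" key.
working = dict(base)

print(json.dumps(broken, indent=2))
print("tools" in working)  # False
```

The point of the comparison is that these two bodies should be semantically identical to the backend, but per the report they are not.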

It is sufficient to say that the AI model should not be sending calls to functions that were never passed as an API parameter, nor repeating function calls as an unstoppable default. The API backend should also be returning a “no tool recipient” error if the AI generates a " to=" token instead of closing the assistant preamble …
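A hypothetical sketch of the backend guard proposed here: if no tools were supplied in the request but the model emits a " to=" recipient token, reject the completion rather than returning a fabricated function call. The function name and error text are my own assumptions, not OpenAI's implementation.

```python
def validate_completion(raw_text: str, tools_supplied: bool) -> str:
    """Hypothetical server-side check: refuse to return a tool call when
    the request carried no tools for the model to address."""
    if not tools_supplied and " to=" in raw_text:
        raise ValueError(
            "no tool recipient: model emitted a tool-call token "
            "but no tools were supplied in the request"
        )
    return raw_text
```

With such a check in place, the failure would surface as an API error instead of silently returning calls to invented functions.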

This is not a new “system prompt your tool” model, nor a “mandated tool call” API call.

This is a signal that this symptom of not stopping at the correct token is extending to all fringes of API use.

(Ignore the model-provoking prompt, and that I show a 16k “openai-internal” model in the screenshots…)

With no special setup and sampling only somewhat constrained by top-p, this function output is repeated across trials of gpt-3.5-turbo-1106 and gpt-3.5-turbo-0125 on Chat Completions, with an empty "tools": [] list being sent by the Playground:

What the doodle??

or:

The models are getting loaded up with tool machinery despite no functions being specified in the tools parameter, and are apparently constrained to sending to a recipient the AI must fabricate from its post-training. At top-p: 0:

I believe that an empty tools list was previously blocked, at least by the SDK.
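Until that is the case again, a client-side workaround is straightforward: strip an empty tools list from the request parameters before sending. This is a sketch of my own, not an SDK feature; the helper name is hypothetical.

```python
def sanitize_tools(params: dict) -> dict:
    """Drop a falsy 'tools' entry so the API never receives "tools": []."""
    cleaned = dict(params)
    if not cleaned.get("tools"):
        cleaned.pop("tools", None)
    return cleaned
```

Passing every request body through such a filter reproduces the old blocked-by-SDK behavior without changing requests that do carry real tool definitions.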


@_j greatly appreciate the detail here! Would you mind sharing this with support@openai.com? That way our team can address it head on in case this does not get traction on the community forum.

I do not see why someone with “OpenAI Staff” as a tag would solicit off-forum interactions for something that is clearly described here and was a platform-wide issue.

Instead, identifying bug reports as being such when well articulated, and escalating to the correct internal team, is the expectation. “Traction in the community” is not the goal and would not provide relief anyway.


Thank you for your interest, though.

It is sufficient to say that, in the week since my report here, an issue as significant as a large swath of AI models completely failing in the Playground was noticed and fixed: calls with the same parameters now succeed (along with repairs to other Playground-breaking bugs, such as sending a request for encrypted reasoning to non-reasoning models).