Getting better results when using Assistant in playground versus using the API

Hi, I’ve noticed that we are getting different results when using the assistant in the playground vs. the API.
It seems that the playground is using a better version of the code/model which is not accessible from the API.

Some things we’ve noticed:

  • The UI responds with function calls more frequently than the API. When asking the same question via the API, it can return a text response with the function’s JSON embedded in the middle of the text.
  • The UI follows instructions better for complex questions and can make more than one function call, whereas the API mostly results in only one function call.
  • When asking a question multiple times, the responses we get from the UI are more consistent than the ones we get from the API.

Has anyone else noticed issues like this? Any suggestions on how to address them?

A similar question was asked some time ago too: Different responses assistant playground vs api - using same assistant id


I also experienced this recently.

This was driving me crazy; I got wildly different results from the playground and the API. I noticed that in the playground I would routinely clear the thread to test my new changes (and the token count), but when using the API I was reusing the same thread. So all of my tweaks and test prompts remained in the context window, which I believe caused the API responses to differ greatly from the playground responses.

I got much more consistent results once I started opening a new thread / deleting the old thread. Now I create a new thread whenever I’m testing changes on the API side, and it is working better. I’m using the model gpt-4-0125-preview.

Hope this helps!
