Keeping Assistants in a Box

Yeah, great point… also versioning of prompts, so you can compare the quality of the models over time.
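
A minimal sketch of one way to do that in Python, assuming eval results get appended to a JSONL file (the `log_result` helper and the record fields are just illustrative): hash the prompt text so each result is tied to an exact prompt version and can be compared per prompt/model pair later.

```python
# Pin prompt versions by content hash and store the hash alongside each
# eval result, so scores can be compared per prompt/model pair over time.
# The record layout here is illustrative, not a fixed schema.
import datetime
import hashlib
import json

def prompt_version(prompt_text: str) -> str:
    """Short content hash that changes whenever the prompt text changes."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:12]

def log_result(prompt_text: str, model: str, score: float,
               path: str = "evals.jsonl") -> None:
    """Append one eval result, tagged with prompt version and model name."""
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prompt_version": prompt_version(prompt_text),
        "model": model,
        "score": score,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```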


We have the impression that even when we keep prompts constant, we are seeing "drift" in how the Assistants API responds to an identical context. We are using gpt-4o-mini as our model. My guess is that OpenAI is making a series of changes to the "envelope" in which our assistants run (perhaps to improve safety) through additions and changes to the context.

For example, our instructions used to do a nice job of limiting the length of a response, but over time we have seen response lengths change dramatically in some situations. Admittedly, we don't yet have a completely solid baseline established. In the context of this thread, though, it is this "escape" problem (where the assistant starts producing responses that are beyond its scope) that has been the most difficult challenge, because it has been changing without any changes to our instructions.

We have already invested in automation to test that a fixed set of questions produces good responses. But I can see that we're going to need a lot more of this over time, especially when we want to move the system to a different model.
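
For anyone building something similar, here is a minimal sketch of that kind of regression check against the Assistants API, assuming an existing assistant and a fixed question set. `ASSISTANT_ID`, `QUESTIONS`, and `MAX_WORDS` are placeholders, and the length check is just one example of a drift signal:

```python
# Regression sketch: run a fixed question set against an existing assistant
# and flag responses that blow past an expected length budget.
from openai import OpenAI

client = OpenAI()

ASSISTANT_ID = "asst_..."   # the assistant under test (placeholder)
MAX_WORDS = 120             # length budget implied by the instructions
QUESTIONS = [
    "What plans do you offer?",
    "How do I reset my password?",
]

def ask(question: str) -> str:
    """Send one question through a fresh thread and return the reply text."""
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=question
    )
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=ASSISTANT_ID
    )
    if run.status != "completed":
        raise RuntimeError(f"run ended with status {run.status}")
    # Messages are returned newest first, so data[0] is the assistant reply.
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value

for q in QUESTIONS:
    answer = ask(q)
    words = len(answer.split())
    status = "OK" if words <= MAX_WORDS else "DRIFT?"
    print(f"[{status}] {words:4d} words | {q}")
```

Run on a schedule and diff the results against an earlier run to catch drift without any change on your side.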


If you just want constant results, it might be better to get a deployment on Azure instead. I assume they don't change the models there; once deployed, it stays deployed. It would really surprise me if Azure did that, tbh.
If you want a change because a new model version performs better, you will need to redeploy.