Keeping Assistants in a Box

Yeah, great point… also versioning of prompts, so you can compare the quality of the models over time.
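
A minimal sketch of one way to do that in Python, assuming eval results get appended to a JSONL file (the `log_result` helper and the record fields are just illustrative): hash the prompt text so each result is tied to an exact prompt version and can be compared per prompt/model pair later.

```python
# Pin prompt versions by content hash and store the hash alongside each
# eval result, so scores can be compared per prompt/model pair over time.
# The record layout here is illustrative, not a fixed schema.
import datetime
import hashlib
import json

def prompt_version(prompt_text: str) -> str:
    """Short content hash that changes whenever the prompt text changes."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:12]

def log_result(prompt_text: str, model: str, score: float,
               path: str = "evals.jsonl") -> None:
    """Append one eval result, tagged with prompt version and model name."""
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prompt_version": prompt_version(prompt_text),
        "model": model,
        "score": score,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```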


We have the impression that even when we keep prompts constant, we are seeing "drift" in how the Assistants API responds to an identical context. We are using gpt-4o-mini as our model. My guess is that OpenAI is making a series of changes to the "envelope" in which our assistants run (perhaps to improve safety) through additions and changes to the context.

For example, our instructions used to do a nice job of limiting the length of a response, but over time we have seen response lengths change dramatically in some situations. Admittedly, we don't yet have a completely solid baseline established. In the context of this thread, though, it is this "escape" problem (where the assistant starts producing responses that are beyond its scope) that has been the most difficult challenge, because it has been changing without any changes to our instructions.

We have already invested in automation to test that a fixed set of questions produces good responses. But I can see that we're going to need a lot more of this over time, especially when we want to move the system to a different model.
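
For anyone building something similar, here is a minimal sketch of that kind of regression check against the Assistants API, assuming an existing assistant and a fixed question set. `ASSISTANT_ID`, `QUESTIONS`, and `MAX_WORDS` are placeholders, and the length check is just one example of a drift signal:

```python
# Regression sketch: run a fixed question set against an existing assistant
# and flag responses that blow past an expected length budget.
from openai import OpenAI

client = OpenAI()

ASSISTANT_ID = "asst_..."   # the assistant under test (placeholder)
MAX_WORDS = 120             # length budget implied by the instructions
QUESTIONS = [
    "What plans do you offer?",
    "How do I reset my password?",
]

def ask(question: str) -> str:
    """Send one question through a fresh thread and return the reply text."""
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=question
    )
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=ASSISTANT_ID
    )
    if run.status != "completed":
        raise RuntimeError(f"run ended with status {run.status}")
    # Messages are returned newest first, so data[0] is the assistant reply.
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value

for q in QUESTIONS:
    answer = ask(q)
    words = len(answer.split())
    status = "OK" if words <= MAX_WORDS else "DRIFT?"
    print(f"[{status}] {words:4d} words | {q}")
```

Run on a schedule and diff the results against an earlier run to catch drift without any change on your side.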


If you just want constant results, it might be better to get a deployment on Azure instead. I assume they don't change the models there; once deployed, it stays deployed. It would really surprise me if Azure did that, tbh.
If you want a change because a new model version performs better, you will need to redeploy.