In our system, we use a single assistant for a large number of separate users, but each user is given a separate thread. We saw a case where one user started sending requests in Chinese and the responses were also in Chinese. The RAG content is in English. All of that is fine and expected.
The problem is that a separate user in a different thread asked a question in English around the same time and got a response in Chinese.
From what I have read, there should be NO shared context between different threads for the same assistant. Does anyone know what might be going on? We checked carefully and there should not have been any Chinese in the second thread's context at all at the time of the Chinese response. This second thread was a brand new thread (using createThreadAndRun).
This is one of many thousands of threads that have been created over recent days. We do not keep the full thread details, so we can’t give you the full context. There is also file_search involved, so we can’t share the entire RAG content, but it is in English and, as far as we can tell, has no Chinese content.
Recall that this is a simple thread. We used createAndRun. The assistant instructions are in English and tell the assistant to find and summarize a particular piece of content based on the request. That content is in English. The prompt is a simple sentence giving the URL and title of that content.
The assistant found the correct content and summarized it in Chinese. If I try to reproduce the same scenario now, I only get responses in English.
I suppose that we should be storing thread IDs along with that data. But that wouldn’t give us access to the full context – unless we store the file_search results as well – which could be voluminous.
Anyway, I wondered if anyone was aware of spillover between threads because of caching, for example. If we continue to see this happening then we will need to detect and instrument so that we have the full data that you are suggesting.
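For the "detect" part, a crude check could flag any response that contains Chinese characters when the prompt did not. This is only a sketch: the function names are made up for illustration, the check looks only at two common Han code point ranges, and it would also fire on Japanese text that uses kanji.

```python
def contains_han(text: str) -> bool:
    """Return True if the text has any character in two common Han ranges."""
    return any(
        0x4E00 <= ord(ch) <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= ord(ch) <= 0x4DBF   # CJK Unified Ideographs Extension A
        for ch in text
    )


def looks_like_language_spillover(prompt: str, response: str) -> bool:
    """Flag a response containing Han characters when the prompt had none."""
    return contains_han(response) and not contains_han(prompt)
```

A flagged response would then be the trigger for saving the thread ID, run ID, and whatever context is still available.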
Do you delete your threads? Or are you saying that you have no mapping of thread IDs to the user? If the thread still exists and messages can still be added to it, then you definitely have the full thread details.
Let’s say that the cache system had a poor hash lookup and was able to use the wrong KV context. How would I counter that?
I would want to break the system so it doesn’t work cross-customer.
If it was in Assistants, I would do that by using the additional_instructions run parameter. Place a hashed, nonced timestamp there with “chat session start: {date}; chat session id {random}; user id: {id}”. That should be enough to break any cross-customer cache match that might extend past the point where the contexts stop being identical.
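A minimal sketch of building that per-session string is below. The function name and the choice to hash the user ID with the nonce (so the raw ID never appears in the prompt) are my own assumptions, not anything the API requires. How the string is passed to the run is left to the caller.

```python
import hashlib
import secrets
from datetime import datetime, timezone
from typing import Optional


def build_session_preamble(
    user_id: str,
    now: Optional[datetime] = None,
    nonce: Optional[str] = None,
) -> str:
    """Build a unique per-session line to include in the run's instructions."""
    now = now or datetime.now(timezone.utc)
    # A fresh random nonce makes the text differ on every session.
    nonce = nonce or secrets.token_hex(8)
    # Hash the user ID together with the nonce instead of sending it raw.
    hashed_user = hashlib.sha256(f"{nonce}:{user_id}".encode()).hexdigest()[:16]
    return (
        f"chat session start: {now.isoformat()}; "
        f"chat session id: {nonce}; "
        f"user id: {hashed_user}"
    )
```

The `now` and `nonce` arguments exist only so the output can be pinned in tests; in normal use you would call it with just the user ID.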
Shouldn’t be necessary, but then again, you are using Assistants, which breaks over and over.
We don’t delete our threads. And we do maintain the current thread ID for each user. But, unfortunately, these threads get periodically cycled – particularly when a user asks to start a new context. However, given this discussion, we’re adding a bit of code to keep that full history.
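Roughly, the history we have in mind is an append-only record of which user ran on which thread, so that cycled threads stay traceable. The sketch below uses SQLite purely as an example store; the table and function names are illustrative and not our actual schema.

```python
import sqlite3
import time
from typing import List, Optional


def open_run_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the log of user -> thread -> run mappings."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_log ("
        "user_id TEXT NOT NULL, thread_id TEXT NOT NULL, "
        "run_id TEXT NOT NULL, created_at REAL NOT NULL, "
        "PRIMARY KEY (thread_id, run_id))"
    )
    return conn


def record_run(conn: sqlite3.Connection, user_id: str, thread_id: str,
               run_id: str, created_at: Optional[float] = None) -> None:
    """Record one run; a repeated (thread_id, run_id) pair is ignored."""
    ts = time.time() if created_at is None else created_at
    conn.execute(
        "INSERT OR IGNORE INTO run_log VALUES (?, ?, ?, ?)",
        (user_id, thread_id, run_id, ts),
    )
    conn.commit()


def threads_for_user(conn: sqlite3.Connection, user_id: str) -> List[str]:
    """Return every thread a user has had, oldest first, including cycled ones."""
    rows = conn.execute(
        "SELECT thread_id FROM run_log WHERE user_id = ? "
        "GROUP BY thread_id ORDER BY MIN(created_at)",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Because rows are never overwritten when a user starts a new context, an odd response can be traced back to a specific thread and run later.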
Threads should not get mixed up between different users. While it is possible that there is a bug in how the backend of the Assistants API handles threads, it is also possible that an edge case on your side is causing this issue.
In situations like this, I usually check how many similar reports we have received. While this behavior is not unheard of, it is rather uncommon.
My suggestion is to run tests on your codebase to identify the cause, implement a safety mechanism as @_j suggested, or at least reproduce the error so OpenAI can investigate and fix it.
Thanks. We have already implemented @_j's suggestion and reviewed our records looking for another example. We have tried to reproduce this but have so far been unsuccessful. With our new instrumentation, we should have the data needed to link back to a specific thread and run in case it happens again.