Assistant in Beta V2 calls other assistant file-id in annotation

So I have my application flow set up to allow the selection of one or more assistants in a single thread, with the ability to add or remove assistants at any time during the thread.

Some assistants have file_search and some do not. The ones with file_search already have the vector_store_id defined on the assistant; no additional vector_store_ids are loaded onto the thread. In a thread with two assistants using file_search, the second assistant to respond returns the correct data, but the annotation displays a file_id from the first assistant's vector store.

I am not sure whether the second assistant is actually reading that vector store, or whether it is reading its own and the annotations are just displayed incorrectly. I am doing further testing to see if I can get the second assistant to produce a response that uses data from the first assistant's vector store.

Previous assistant files attached to retrieval were automatically migrated and mapped into vector stores, which can be used simply by making the beta v2 API calls with file_search enabled. The assistant then also has a vector store as part of its enabled internal tools.
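If I'm reading the migration right, in v2 terms that setup looks something like this sketch (assumes the openai Python SDK's Assistants v2 beta surface; the model name, assistant name, and vector store ID are illustrative):

```python
# Sketch: a file_search-enabled assistant whose tool_resources point at
# one vector store (v2 beta SDK surface; model/name/IDs are illustrative).

def assistant_args(vector_store_id: str) -> dict:
    """Arguments for creating an assistant with file_search enabled
    against a specific vector store."""
    return {
        "model": "gpt-4o",
        "name": "Standards Assistant",
        "tools": [{"type": "file_search"}],
        "tool_resources": {
            "file_search": {"vector_store_ids": [vector_store_id]},
        },
    }


def create_assistant(client, vector_store_id: str):
    # client = openai.OpenAI()  # needs OPENAI_API_KEY set
    return client.beta.assistants.create(**assistant_args(vector_store_id))
```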

The migration docs talk about this, with just enough information to leave you scratching your head.

Which behavior an assistant created through the v1 API gets is determined by your API HTTPS request header (or by the library module version you use, which selects that header).
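Concretely, the version is carried in the documented `OpenAI-Beta` request header; a small sketch of pinning it explicitly (the header value is real, the rest is illustrative):

```python
# Sketch: the Assistants beta version is selected via the "OpenAI-Beta"
# HTTP header. Current SDK releases send assistants=v2 by default; older
# releases sent assistants=v1.

def beta_header(version: str = "v2") -> dict:
    return {"OpenAI-Beta": f"assistants={version}"}

# Illustrative usage:
# client = openai.OpenAI(default_headers=beta_header("v2"))
```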

Yeah, I got it working correctly, as far as I can tell, as long as I interact with a single assistant per thread.

But in this case there are two assistants in the same thread, each with its own vector store. When the second assistant provides a response, the context of the answer appears correct, but it will source chapters/sections from a file_id belonging to the first assistant.

I am continuing testing, and it seems like the context always remains correct, but whenever it references a section/chapter or provides annotations, it always pulls from the first assistant's file_ids instead of the file_ids from its own vector store. It will even state in a follow-up response that, yes, the references and annotations are for some reason from a different document, even though the context appears correct.

The thread contains previous tool returns also.

If the file search tool dumped 16k of documentation into the chat history, the new AI will still see that.

You can experiment and see if a no-search assistant also sees and can answer from the previous searches, as that would seem to be an indeterminate case, where those chat turns may or may not be muted by OpenAI.


That is some great insight.

I did some more testing and have it narrowed down a little further now.

I tested your suggestion and could not get a no-search assistant to search the documents, but it could restate information from previous assistants' responses in the thread.

I also tested having a file-search assistant follow a no-search assistant, and it was able to search its files as normal.

And finally I tested two search-enabled assistants one after another, but addressing completely different topics. With that setup it was clear that the second assistant in the thread was always trying to search the first assistant's documents, and it states it can't find the answer. But if the same question is asked of the same assistant in its own thread, it works as expected.

My conclusion is that currently a thread cannot support more than one file-search enabled assistant.

My guess is that, to make this work, I will need to load the vector_store_ids onto the thread directly instead of onto the assistants.
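What I have in mind is roughly this sketch (v2 beta SDK surface; the thread and store IDs are illustrative; as far as I can tell, a thread's file_search takes a list of vector store IDs but is limited to one attached store):

```python
# Sketch: attach the vector store(s) to the thread rather than to each
# assistant, so any assistant that runs on the thread searches the same
# store (v2 beta SDK surface; IDs illustrative).

def thread_resources(vector_store_ids: list[str]) -> dict:
    """Build the tool_resources payload for a thread's file_search."""
    return {"file_search": {"vector_store_ids": vector_store_ids}}


def attach_to_thread(client, thread_id: str, vector_store_ids: list[str]):
    return client.beta.threads.update(
        thread_id,
        tool_resources=thread_resources(vector_store_ids),
    )
```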


This sounds expected. To clarify: the second assistant isn't able to search the first assistant's vector store. Instead, what's happening is that the raw chunks that were extracted for the first assistant during the previous Run are still part of the context in the Thread, so those extracted chunks are still usable by the second assistant when it does its Run.

Do you have a use case where you'd want the second assistant to not know what the first assistant extracted from its vector store? And if so, what's the use case?

Thanks for your feedback!


Thank you for the clarification! It makes a lot more sense now.

In my use case, my assistants are each dedicated to an Engineering Standard, Code, or Manual. In most cases I would say it is good that an assistant has access to the previous assistant's chunks, as they may contain information relevant to the user input. However, I can already identify several use cases where it may be important for the second assistant to not have access to the first's chunks while still having the previous assistant's response as context.

One example: during a thread, the user needs one assistant for a specific code answer and gets it, but then realizes there is a relevant follow-up question that is more appropriate for a different assistant, so that the original answer can be compared against the answer from the other assistant's code/standard/manual. In this case the user's input will be similar, and file search may find the most relevant data in the first assistant's chunks, but the user is looking for a comparison against the other assistant's vector store data.

As further clarification after more testing I found more relevant results.

In cases where the second assistant is added to a thread that includes chunks from another assistant's vector store file_search, the second assistant always seems to either say it cannot find the relevant data or reference the chunks from the first assistant.

Even when the second assistant is prompted to search only its own vector store, it can never seem to retrieve chunks from it; in those cases a "no relevant data" response is returned. The other cases appear to display the correct context in the response (most likely due to the system prompt), but the annotations always come from the first assistant's chunks.

Also, as a follow-up: my earlier statement about loading the file_ids into a thread vector store was correct. When all relevant files for all assistants in the thread are in a single vector store, each assistant returns responses containing the relevant data from its specified code/standard/manual.
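For anyone following along, the working setup is roughly this sketch (v2 beta SDK surface; the store name and file IDs are illustrative): merge every assistant's file_ids into one vector store and attach that store to the thread.

```python
# Sketch: one shared vector store per thread, holding every assistant's
# files (v2 beta SDK surface; names/IDs illustrative).

def merged_file_ids(per_assistant_files: dict[str, list[str]]) -> list[str]:
    """Flatten each assistant's file list into one de-duplicated list,
    preserving first-seen order."""
    seen: set[str] = set()
    merged: list[str] = []
    for file_ids in per_assistant_files.values():
        for fid in file_ids:
            if fid not in seen:
                seen.add(fid)
                merged.append(fid)
    return merged


def create_shared_thread(client, per_assistant_files: dict[str, list[str]]):
    store = client.beta.vector_stores.create(
        name="thread-shared-store",
        file_ids=merged_file_ids(per_assistant_files),
    )
    return client.beta.threads.create(
        tool_resources={"file_search": {"vector_store_ids": [store.id]}},
    )
```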

It makes sense that a thread’s single vector store follows the thread to any assistant that runs on it.

There is a particular part of the documentation that may come into play, but the “why” behind it isn’t laid out.

When a vector store expires, runs on that thread will fail. To fix this, you can simply recreate a new vector_store with the same files and reattach it to the thread.

It may be that the thread's "file_search": {"vector_store_ids": …} then refers to a particular ID that no longer exists, and they made the AI not continue silently past this error. They are telling you to replace the ID with a new one; removing it with modify thread should also allow you to continue, I think.
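The documented fix, as a sketch (v2 beta SDK surface; file IDs illustrative): rebuild a store from the same files and swap the stale ID out of the thread.

```python
# Sketch: replace an expired vector store on a thread (v2 beta SDK
# surface; IDs illustrative).

def replacement_resources(new_store_id: str) -> dict:
    """tool_resources payload pointing the thread at the fresh store."""
    return {"file_search": {"vector_store_ids": [new_store_id]}}


def reattach_fresh_store(client, thread_id: str, file_ids: list[str]) -> str:
    # Recreate a store from the same files, then update the thread so
    # its file_search no longer references the expired store's ID.
    new_store = client.beta.vector_stores.create(file_ids=file_ids)
    client.beta.threads.update(
        thread_id,
        tool_resources=replacement_resources(new_store.id),
    )
    return new_store.id
```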


I think the actual experience here is that:

  1. The assistant AI model doesn’t like to keep searching on the present tool after it has a long context or already sees one search result;
  2. There’s no information about the switcharoo to let the AI think different results may be possible;
  3. If you remove the file search and its injected tool specification completely, the AI is going to have less insight into how the document text got there.

Maybe I’ll play with it when I have $1 questions.