Assistants can't see files from threads when a vector store is provided in the configuration


I noticed strange behaviour when using the Assistants API with file vector stores. It seems that when an assistant is created with a vector store attached, it completely ignores any files provided through a thread vector store or message attachments.

Steps to reproduce

  1. Create a vector store with one or more PDF files attached.
  2. Create a new assistant with that vector store attached and a custom instruction (like “Read the name of every file you can access”).
  3. Create a new thread.
  4. Run the assistant on the thread. The assistant will list the files from the vector store (correct).
  5. Create a second vector store with some other PDF files and attach it to the thread.
  6. Send another message (e.g. “What files can you access?”).
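For anyone who wants to try this from code, here is a rough sketch of the steps above. It assumes `client` is an `openai.OpenAI()` instance with Assistants v2 and that both vector stores already exist. The model name, the first user message, and the helper names `file_search_resources` and `reproduce` are my own placeholders, not part of the API.

```python
# Sketch only: `client` is assumed to be an openai.OpenAI() instance.
FILE_SEARCH = {"type": "file_search"}


def file_search_resources(vector_store_id):
    """Build the tool_resources payload that points file_search at one vector store."""
    return {"file_search": {"vector_store_ids": [vector_store_id]}}


def reproduce(client, assistant_vs_id, thread_vs_id, model="gpt-4o"):
    """Walk through steps 2-6; returns the thread id and the second run."""
    # Step 2: assistant with the first vector store attached.
    assistant = client.beta.assistants.create(
        model=model,  # placeholder model name
        instructions="Read the name of every file you can access.",
        tools=[FILE_SEARCH],
        tool_resources=file_search_resources(assistant_vs_id),
    )
    # Steps 3-4: new thread, first question, first run.
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="List your files."
    )
    client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant.id
    )
    # Step 5: attach the second vector store at thread level.
    client.beta.threads.update(
        thread.id, tool_resources=file_search_resources(thread_vs_id)
    )
    # Step 6: ask again; the answer is where the problem shows up.
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="What files can you access?"
    )
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant.id
    )
    return thread.id, run
```

After the second run, listing the thread messages should show whether any file from the second vector store is cited.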

GPT will still see only the files from the first vector store. Even if I keep asking about files from the second vector store, it says it does not see them. Once, it told me it could see one file from the second vector store, but the reference still pointed to the first file.

I noticed this issue using the API, but I can reproduce it in the playground as well.
I suppose this is not the correct behaviour.

Some screenshots from playground below:

I kept asking, and uploaded another file in a message.
Here is the result:

It is confusing.

The assistant has files in several places: file search, code interpreter, and threads.

The embedded chunks of attachment vector stores and thread vector stores are all added to the same myfiles_browser that now only has a search method.

Thus, if the AI makes a query that can return at most 20 chunks, it could return all of an attached 10k-token document when that document is the only thing being searched. But when the same search also covers a large number of knowledge files, the attachment's chunks get lost among knowledge chunks with higher similarity.

The results of “seeing files” are displaced by other files.
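A toy illustration of that displacement, not the actual file_search ranking: the similarity scores below are invented, and only the limit of 20 chunks comes from the explanation above.

```python
def top_k_chunks(chunks, k=20):
    """Return the k highest-scoring chunks, as one shared search index would."""
    return sorted(chunks, key=lambda c: c["score"], reverse=True)[:k]


# Hypothetical scores: 30 knowledge chunks that match the query wording well,
# and 5 attachment chunks that match a little worse.
knowledge = [{"source": "assistant_vs", "score": 0.80 - 0.001 * i} for i in range(30)]
attachment = [{"source": "thread_vs", "score": 0.60 - 0.001 * i} for i in range(5)]

# Searched alone, every attachment chunk comes back.
alone = top_k_chunks(attachment, k=20)

# Searched together, the knowledge chunks fill all 20 slots.
mixed = top_k_chunks(knowledge + attachment, k=20)
from_thread = [c for c in mixed if c["source"] == "thread_vs"]

print(len(alone), len(from_thread))
```

The small document is fully retrievable on its own, but in the mixed search none of its chunks reach the model, regardless of how few pages it has.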

Thank you for your answer.
However, I still don’t understand the issue. The files I uploaded are 9 pages in total, which does not exceed the token limit.

How can I use both Assistant level vector store and thread or message attachments in a single run?

If you need 100% understanding of one document, you will need to pass it as “additional_instructions” in plain text for the turns where all must be visible.

So you’re saying upload the files, attach them to the message, then use some sort of prompt engineering in the run call?

We are trying this with no luck. We’re going to try to attach the file ids to the message object, retrieve the thread object and get the associated vector store, then pass that as an override parameter to run.create.

Attachment of a file id to a message just creates another vector store.

I’m talking about:

  • writing your own code to extract searchable text from PDFs
  • verifying the quality of the text - which you cannot do with assistants
  • placing the full text as instructions or additional_instructions
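A minimal sketch of the last two bullets, assuming the text has already been extracted from the PDF by your own code. The quality check is a crude heuristic with made-up thresholds, and the helper names are hypothetical; `client` is assumed to be an `openai.OpenAI()` instance, and `additional_instructions` is the run parameter mentioned above.

```python
def looks_like_clean_text(text, min_chars=200, min_ratio=0.9):
    """Crude quality gate: enough text, and mostly ordinary characters."""
    if len(text) < min_chars:
        return False
    ok = sum(
        ch.isalnum() or ch.isspace() or ch in ".,;:!?'\"()-%/" for ch in text
    )
    return ok / len(text) >= min_ratio


def build_additional_instructions(filename, text):
    """Wrap the full document text so the model sees all of it on this run."""
    return (
        f"The full text of the user's document '{filename}' follows. "
        "Use it directly when answering.\n"
        f"--- BEGIN {filename} ---\n{text}\n--- END {filename} ---"
    )


def run_with_document(client, thread_id, assistant_id, filename, text):
    """Start a run with the whole document in additional_instructions."""
    if not looks_like_clean_text(text):
        raise ValueError(f"extracted text from {filename} looks unusable")
    return client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
        additional_instructions=build_additional_instructions(filename, text),
    )
```

Since additional_instructions applies only to the run it is passed to, the text has to be sent again on every turn where the full document must be visible.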

Attaching even your own text to a message is not ideal, as you cannot tell when it will expire or be ignored by the thread management and need to be refreshed.

Or, the entire forum being full of problems with Assistants should give you enough insight to avoid this endpoint entirely.

Ha ha don’t make me cry.

I agree that some sort of validation is needed to make sure the quality of the text is good. I wrote up the sequence that’s needed here yesterday. I guess this is the price we pay for having Langchain RAG pipelines abstracted away?

So upload file, then use get-content to pass it through a completions endpoint for validation/summarization, if good attach file to the message and then pass the validation/summary to the assistant as additional instructions? That will at least let you know whether the file checks out.

Having a smallish document's tokens mixed in with a full-sized vector store, or several of them, is still a problem. I don’t see any way around it other than disconnecting the datastores for the run and then reattaching them, but so much can go wrong there that having a different agent would be better, I think. I'm not really sure this is an Assistants problem per se, as I imagine it affects any RAG flow.

Thread expiration is an issue that we can design around in our resume-thread UX. We can tell which user messages had an attachment, and we can query the thread’s attached datastore by file ID to see if the file is still there, at which point we can show the user an error accordingly or even re-upload the file from our custom data layer.
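Roughly what that check could look like, as a sketch rather than tested production code. It assumes `client` is an `openai.OpenAI()` instance; the vector store namespace is `client.beta.vector_stores` in some SDK versions and `client.vector_stores` in others, so the helper tries both. The function names are our own.

```python
def _vector_stores(client):
    """Pick whichever vector store namespace this SDK version exposes."""
    return getattr(client, "vector_stores", None) or client.beta.vector_stores


def thread_vector_store_ids(client, thread_id):
    """Return the vector store ids attached to the thread for file_search."""
    thread = client.beta.threads.retrieve(thread_id)
    resources = thread.tool_resources
    if resources is None or resources.file_search is None:
        return []
    return list(resources.file_search.vector_store_ids or [])


def file_still_attached(client, thread_id, file_id):
    """True if file_id is still in a non-expired vector store of the thread."""
    stores = _vector_stores(client)
    for vs_id in thread_vector_store_ids(client, thread_id):
        if stores.retrieve(vs_id).status == "expired":
            continue
        try:
            stores.files.retrieve(vector_store_id=vs_id, file_id=file_id)
            return True
        except Exception:
            # The SDK raises a not-found error here; caught broadly so the
            # sketch does not depend on a specific exception class.
            continue
    return False
```

When this returns False for a message that had an attachment, the resume flow can either warn the user or re-upload the file from the custom data layer.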

We may even modify the file upload flow to give users the option to choose between a default thread upload and attaching to the assistant's default vector store.