File attachments vs vector stores: Which offers better answer quality?

I was wondering whether there a difference (in quality of answers for instance) in attaching files to threads over using vector stores.
I’m saying that because i have the feeling that sometimes the Assistant lacks context for the file.
did you experience anything like this ?

1 Like

Hi @Ohmann, welcome to the community! That’s a great question, and one that often confuses people who are new to working with the Assistants API. Let me clarify how this works:

While you might be “attaching a file” in the appearance of API parameter when you give a file ID in messages for the tool file_search, the actual behavior is simply to add that ID to a thread’s vector store. This is combined with the Assistant’s vector store, where both vector stores are possible sources to be added to the results that the AI’s file_search tool will return.

So: there should be no difference. The attachment is still behind a file search tool the AI has to use, and it has no idea what kind of results will be returned from a search until you discuss those yourself in system messages.

Most importantly, the AI doesn’t get any sort of attachment of a whole file it can observe. The extracted chunked data is behind the search query, where only top rank results are returned.

Your application can be significantly improved if you tell the AI what kind of knowledge from files it will find behind a search query, so it uses the tool efficiently, although for user-provided files, giving file names or file summaries to improve searching and specialization is more product that you can develop yourself.

Hope that helps clarify things!

2 Likes

Many thanks for your reply Jay.
when you say “giving file names or file summaries” you mean in the context ?

the annoying situation that tends to happen is the following :
The user will upload a file, and then ask “can you summerise this?”. Semantically, “what do you think of this?” has no link to the file uploaded so it ends up looking elsewhere to respond or says it has no access to this file.

it makes a poor user experience