File search + vector store confusion

hello community,

I am a little confused with some of the assistant v2 changes.

First, can you confirm that I understand correctly what file search and vector stores do?

FILE SEARCH
It allows uploading a file to an assistant so it can be searched.

VECTOR STORE
An uploaded file (added via file search) is stored in a vector store for semantic search.

If the previous is correct, my confusion is: why are they presented as two different things? If you want to use RAG you need file search + vector store, right?

Last thing: do vector stores always need to be created externally, or does the assistant provide a default one?

thanks,

Gerard

assistants: an endpoint with individual assistants that are a collection of instructions.

files endpoint: where you upload any files to be used, with several purposes besides assistants.

vector store: the place where your chunked document files go. When you add uploaded files to a vector store, text extraction is performed, the extracted text is chunked, and embeddings are added as metadata, allowing semantic search on those pieces to find the most relevant results.
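As an illustration of what "semantic search on those pieces" means, here is a toy sketch (not the actual implementation; the chunks and their 3-d embedding vectors are invented, whereas real embeddings come from an embedding model and have thousands of dimensions):

```python
import math

# Toy stand-in for a vector store: each text chunk is paired with an
# embedding vector (made-up 3-d vectors for illustration only).
chunks = {
    "Refunds are issued within 14 days.": [0.9, 0.1, 0.0],
    "Our office is closed on Sundays.":   [0.1, 0.8, 0.2],
    "Shipping takes 3-5 business days.":  [0.7, 0.2, 0.3],
}

def cosine(a, b):
    # Cosine similarity: how close two embeddings point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_embedding, top_k=1):
    # Rank chunks by similarity of meaning, not by keyword overlap.
    ranked = sorted(chunks, key=lambda c: cosine(chunks[c], query_embedding),
                    reverse=True)
    return ranked[:top_k]

# Pretend this is the embedding of "how long until I get my money back?"
print(search([0.85, 0.15, 0.05]))
# → ['Refunds are issued within 14 days.']
```

Note the query shares no keywords with the winning chunk; the match is by meaning, which is the whole point of the embeddings.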

File search: the function the AI can call to submit searches and receive results based on similarity.


A single vector store can be attached to an assistant. It doesn’t have a default one.
A single vector store can be created when you attach files to a thread’s messages (which don’t appear alongside the message).
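For the assistant-level case, the request shape looks roughly like this (a sketch using the v2 parameter names; the vector store ID is a placeholder):

```python
# Sketch of the tool_resources payload for attaching a vector store to an
# assistant. "vs_abc123" is a placeholder ID, not a real vector store.
def file_search_params(vector_store_id):
    return {
        "tools": [{"type": "file_search"}],
        "tool_resources": {
            "file_search": {"vector_store_ids": [vector_store_id]}
        },
    }

params = file_search_params("vs_abc123")

# These kwargs would then go into the create call, e.g.:
# assistant = client.beta.assistants.create(
#     model="gpt-4-turbo",
#     instructions="Answer from the attached documents.",
#     **params,
# )
```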


Retrieval-augmented generation to me is automatic retrieval first (based on the conversation), then generation that is augmented by that. Not what assistants have.

Search is just that: an AI “Googling” of your documents, without basis in keywords but in meanings, and returning larger pieces as the end result.


@_j thank you very much for your reply.

A single vector store can be attached to an assistant. It doesn’t have a default one.
A single vector store can be created when you attach files to a thread’s messages (which don’t appear alongside the message).

If you attach a vector store to the assistant, does that mean you no longer need to attach one to a message?
Is there a way to control when RAG is applied and when it isn't?

Thanks :slight_smile:

Remember: the AI has to invoke this search function.

If you know you have documents that a user is going to ask about, and they aren't impossibly large, it is much better to do your own text extraction and place the text directly in a message, so all of it is available to the AI, without doubling the model calls.
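That direct-placement approach could look something like this (a sketch; the document name and the wrapper format are my own convention, nothing the API requires):

```python
# Sketch: skip file search entirely by extracting text yourself and pasting
# it into the first user message of a thread.
def build_document_message(doc_name, doc_text, question):
    content = (
        f"Reference document: {doc_name}\n"
        f"---\n{doc_text}\n---\n\n"
        f"{question}"
    )
    return {"role": "user", "content": content}

msg = build_document_message(
    "refund_policy.txt",
    "Refunds are issued within 14 days of purchase.",
    "How long do refunds take?",
)

# This dict is what you'd put in the messages list when creating a thread:
# thread = client.beta.threads.create(messages=[msg])
```

The trade-off: every token of the document is billed on every run, but the AI sees the full text instead of only the top search results.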

The vector store attached to a thread gives files a place where they don't expire after they "scroll off" the conversation due to length; instead, they expire after 7 days of inactivity. And the full text is not available to the AI, just the most relevant search results.
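Attaching a file to a thread message (which is what creates that thread-level vector store) uses the v2 "attachments" shape, roughly like this sketch ("file-xyz" is a placeholder file ID):

```python
# Sketch of the v2 attachments payload for putting an uploaded file on a
# thread message so it lands in that thread's vector store.
def message_with_attachment(file_id, text):
    return {
        "role": "user",
        "content": text,
        "attachments": [
            {"file_id": file_id, "tools": [{"type": "file_search"}]}
        ],
    }

msg = message_with_attachment(
    "file-xyz",
    "What does the attached contract say about termination?",
)

# Passed along as, e.g.:
# client.beta.threads.messages.create(thread_id=thread.id, **msg)
```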


How else can the files endpoint be used? :eyes:

Besides being where you upload batch files for queued chat completions processing and training files for model fine-tuning, files uploaded with the purpose "assistants" can be made available to code interpreter.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a file with an "assistants" purpose
file = client.files.create(
  file=open("mydata.csv", "rb"),
  purpose='assistants'
)

# Create an assistant using the file ID
assistant = client.beta.assistants.create(
  instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
  model="gpt-4-turbo",
  tools=[{"type": "code_interpreter"}],
  tool_resources={
    "code_interpreter": {
      "file_ids": [file.id]
    }
  }
)

Be aware, though, that file validation rejects a lot of file types you might want to use, based on their contents.

thanks @_j
Under our use case, customers won’t need to upload anything; it’s only us who need to improve the assistant’s knowledge with Retrieval-Augmented Generation (RAG) to provide better answers to all of our customers. Additionally, we use functions so the assistant knows when to interact with our backend.

Based on this context and your comments, I understand that I only need to set the instructions, tools, tool resources, and functions at the assistant level, and there is no need to do it at the thread and message levels. Am I correct?

If at any time the instructions change and a thread is already active, is updating the instructions enough for the thread to have the latest context?

The thread of a conversation will have the previous responses from the AI, and also the previous responses of tools that show the AI requesting those tools.

Since every run has you pick a thread and an assistant, the behaviors of the AI are chosen at that run time, plus you have the ability to temporarily add additional_instructions or completely overwrite instructions for that run, besides using the latest version of an assistant at that time.
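Those run-level overrides could be sketched like this (v2 parameter names; the thread and assistant IDs are placeholders):

```python
# Sketch of run-level instruction overrides when starting a run.
def run_params(thread_id, assistant_id, extra=None, override=None):
    params = {"thread_id": thread_id, "assistant_id": assistant_id}
    if extra is not None:
        # Appended after the assistant's stored instructions, this run only.
        params["additional_instructions"] = extra
    if override is not None:
        # Completely replaces the assistant's instructions, this run only.
        params["instructions"] = override
    return params

params = run_params(
    "thread_123", "asst_456",
    extra="Today is a holiday; mention reduced support hours.",
)

# run = client.beta.threads.runs.create(**params)
```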

Thus, yes, you can continually update the assistant each run, or simply update shared behavior for new tweaks, and it is only a confusing prior chat that would keep the AI from talking like a pirate.

If the thread is actually "active" (in a run state), it's probably not a good idea to change the assistant's behavior live, such as removing a tool while the AI is waiting for a tool return.
