Confusion with the vector storage option when searching for files

I recently found some cool updates to the Assistants API. Here is what I really don't understand from the docs:

OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries.

Then, in the code samples on the same pages, they point out that you first upload files and then separately create vector stores.

So my questions:
1. If I need to create a vector store separately, how will file search work if I just add files (without creating a vector store)?
2. When I add files manually through the admin panel, are vector stores created automatically? Or do I need to add them programmatically afterwards (for files that were previously uploaded manually)?

Thank you in advance!


You upload a file, which creates a File ID. After that you create a Vector Store, which creates a Vector Store ID. Then you link your file to the Vector Store. Finally, you pass the Vector Store ID when you create the assistant, like this:

curl "https://api.openai.com/v1/assistants" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.",
    "name": "Math Tutor",
    "tools": [{"type": "file_search"}],
    "model": "gpt-4-turbo",
    "tool_resources": {
      "file_search": {
        "vector_store_ids": ["VECTOR_STORE_ID"]
      }
    }
  }'

Then you create a Thread and a Message, and RUN the Thread ID with the Assistant ID. This is how files are attached to Assistants in Assistants v2.
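The thread-and-run step described above can be sketched as request payloads. This is only a sketch: the message text and the `asst_abc123` ID are placeholders, and the endpoint paths in the comments are the standard Assistants v2 ones.

```python
import json

# Payload for POST /v1/threads — create a thread with one user message.
thread_payload = {
    "messages": [
        {"role": "user", "content": "What is the derivative of x^2?"}
    ]
}

# Payload for POST /v1/threads/{thread_id}/runs — run the thread with a
# given assistant. The assistant ID below is a placeholder.
run_payload = {"assistant_id": "asst_abc123"}

print(json.dumps(thread_payload))
print(json.dumps(run_payload))
```

The run only needs the assistant ID; the vector store travels with the assistant's `tool_resources`, so it does not have to be repeated here.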

Let me know if it answers your question.


Thank you for your efforts explaining.

Assuming the vector store is already set up in the admin panel, do we need to add the vector store ID to each v2 API request as well?

Also, when creating a vector store in the admin panel, it seems it can't be attached ("used by:") to an existing Assistant; creating a new one seems to be the only option. Is that correct, or am I missing something?


OK, after some fiddling I have found that you add the Vector store to the Assistant via the Assistant menu → TOOLS → "File search" (previously "Retrieval") → "Add files" → switch to "Vector store".

However, you cannot remove the files or the VS by following the same path; removal now needs to be done in the "Storage" menu.

My remaining questions are:

Is it still better to convert PDF files to JSON first? Or will the Vector store/Assistant store the files in the most efficient form anyway, so the input format does not matter?


Did you figure out what the advantage of using vector stores over just files is? Does it provide extra capabilities, make the search more accurate, increase the granularity of search, or what?

p.s. thank you once more, I really like how you explain things


@apris OpenAI has changed the way files are managed for RAG; this was introduced in the last 2-3 days with v2 of the Assistants API.

Before, it was called retrieval (v1); now it's called file_search (v2). In v1 you only uploaded the files and the magic happened behind the scenes, but it had the limitation of working with only 10 files, up to 512 MB each.

In v2 they introduce a step in between: files can be shared across multiple Assistants, but you ultimately need to use vector stores with the Assistant. A vector store can be generated from one file or from a concatenation of several files. Either way, you always upload the file(s) first, and then generate the vector store from that one file or those multiple files.

BTW, as a general topic, and the reason I'm writing here: we seem to have found an issue where the vector store / file(s) are not used by default. So we escalated an issue to OpenAI to look into it.

It seems that even though you have generated the vector store and assigned it to the Assistant, it is not being used by default, as it was in v1. The only way to use it is to force it in the prompt. So I hope this is a temporary glitch.


I have switched to v2 (created a vector store with my files) and now the same answers require about 10,000 more prompt tokens. I can't notice any difference other than that, so no, I am not sure what the advantage of vector stores is, apart from supporting up to 5,000 files.

I agree with you here. I just finished experimenting with it, and it's quite expensive to use.

No, you need to create the vector store only once, and when you create an Assistant you provide the vector store ID, also only once. Then you can run any number of threads on that assistant.

Yes, it seems like it.

I don't think you need to convert the PDF to JSON. I have my assistant running on PDFs and it's working fine.

@apris I find Assistant v2 more powerful than v1. I think using vector stores has changed the whole level of the Assistant.

I created a vector store and attached a file to the store in admin.
I then attached vector store to an assistant.
I then created a Thread and a Message and ran it. When I view the thread in the admin panel, it seems to create a new "File search store" every time, with a file size of 0 bytes, and it does not seem to use the Vector store attached to the assistant.
Is this a bug or am I missing something here?


Mapping file IDs to a vector store ID needs improvement. The entire mapping operation fails if there are unsupported formats, and afterwards there is no way to map the already-uploaded files; it re-uploads the same files.

By the way, can I force the use of vector stores through the assistant instructions, like this:


You are an Assistant tasked with responding to user questions about a product/service. When responding to user questions, you strictly rely only on the information provided in the attached files. You never invent any information. If a query cannot be answered through the available information, you respond with: I don’t know.

I found that when the temperature is low (about 0), the price is almost the same as if you put all your data into the assistant instructions. But when you increase the temperature (to at least 1, for example), it becomes several times smaller (about 10 times in my case, but I used just a file with 140 lines, which is pretty small; I guess with bigger files the advantage is more significant).

Thank you for all your useful replies @MrFriday.

Regarding the PDFs: they seem to be working fine. However, reading them consumes an awful lot of tokens, so I am looking to improve efficiency here, and am therefore curious about the performance of JSON vs. PDF.

Does anybody have any observable experience?

PDF files have document text extracted from them.

So they have a similar impact on your wallet as a text file, but you could ensure the data is presented in the way the AI understands best by preparing your own .txt instead of haphazardly relying on a PDF and someone else's searchable-text extractor.

So: a different file may improve the answering capability, but it is unlikely to reduce the budget.

The only technique one might employ is to produce your own text files that are smaller than the 800 tokens + overlap of the chunking. Then, unless there are unexpected concatenations, a maximum of 20 chunks corresponding to 200-token files should have a lower maximum cost.
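The arithmetic behind this technique can be sketched as follows. It assumes the v2 defaults of 800-token chunks and up to 20 chunks returned per file_search call; if those defaults change, the numbers change with them.

```python
# Assumed v2 defaults: 800-token chunks, up to 20 chunks returned per query.
MAX_CHUNK_TOKENS = 800
MAX_CHUNKS_RETURNED = 20

def max_search_context_tokens(file_tokens: int) -> int:
    """Upper bound on tokens file_search can inject into the prompt.

    A file smaller than one chunk yields chunks no larger than the file
    itself, which is the cost-saving trick described above.
    """
    chunk = min(file_tokens, MAX_CHUNK_TOKENS)
    return chunk * MAX_CHUNKS_RETURNED

print(max_search_context_tokens(10_000))  # large file: 800 * 20 = 16000
print(max_search_context_tokens(200))     # small files: 200 * 20 = 4000
```

So splitting one big document into many sub-chunk-sized text files caps the worst case at roughly a quarter of the default, at the price of losing surrounding context within each retrieved chunk.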


From my noob-level understanding, one advantage of a vector DB vs. "just files" is the uniformity of search success across the database, whereas "just files" are prone to needle-in-a-haystack problems. In other words, the ability to find X within the haystack is not uniform across the entire file. That is probably a very noob way of describing the issue. Apparently it's not so much of a problem if you're reading files with gpt-4-V.

I attached the vector store to an existing assistant. It is managed through the "Add files" button. If a vector store is attached, you can detach it. If no store is attached, you can select it by its ID (not its name). However, there is a link to open the list of stores in your account.

I am not sure if this was answered or not, but playing around with the new Vector store I found that in order to invoke the vector store you have to pass it through the Thread API. I don't believe it is used by default just because you associated it with the assistant. I actually prefer this architecture, because I can see a use case where you have different Vector stores and only make one available via one thread call and another via a different thread call. It seems to me like this is the start of OpenAI introducing some level of control over who can have access to which Vector store. Is anybody else seeing this?

@Pacho No, that's not the case. I've been using Vector Stores only with Assistants, meaning I only pass the vector store ID to the assistant and not to the Thread, and it's working fine in my case. You are right, though, that there's a use case where you have multiple Vector Stores and want the Assistant to traverse only a specific one; for exactly that case, there's the option to pass a vector store ID to the Thread.
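The two attachment points being compared can be sketched as payload shapes. This is a sketch of the `tool_resources` layout only, with placeholder IDs, not a complete request body.

```python
import json

# Vector store attached at the assistant level (POST /v1/assistants):
# every thread run on this assistant can search this store.
assistant_payload = {
    "model": "gpt-4-turbo",
    "tools": [{"type": "file_search"}],
    "tool_resources": {"file_search": {"vector_store_ids": ["vs_abc123"]}},
}

# Vector store attached at the thread level instead (POST /v1/threads):
# useful when different threads should see different stores.
thread_payload = {
    "tool_resources": {"file_search": {"vector_store_ids": ["vs_def456"]}},
}

print(json.dumps(assistant_payload))
print(json.dumps(thread_payload))
```

The `tool_resources` block has the same shape in both places, which is what makes the per-thread override pattern described above possible.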


OK. So when you "pass it" to the Assistant, what endpoint are you using? I connect the VS to the assistant when I originally create the Assistant, but I only use the Thread/Run endpoints after that.