How to make the chatbot respond according to the files the user inserted in the vector store?

How can I make my chatbot respond according to the files the user inserted in the vector store?

I’m not using the Assistants API, just simple chat completions, and I don’t intend to use it.

I believe the procedure is to take the user input, embed it, and then use similarity search to compare it against each of the files in the vector store. But I don’t know how to make this comparison, and due to the structure of my project, I can’t import any library.

Welcome to the community!

You typically compute the cosine similarity.

You can just do it in a loop if you don’t have too many vectors to compare.

It’s just the dot product of two normalized vectors :slight_smile:
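Since you can’t import any libraries, here’s a minimal sketch of that in pure Python (no imports). The store layout (a list of dicts with `id` and `vector` keys) is just an assumption for illustration; the vectors stand in for whatever your embeddings model returns:

```python
# Pure-Python cosine similarity and an exhaustive search loop.
# No third-party imports needed.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return dot(a, a) ** 0.5

def cosine_similarity(a, b):
    # For vectors that are already normalized to unit length,
    # this reduces to the plain dot product.
    return dot(a, b) / (norm(a) * norm(b))

def most_similar(query_vec, stored):
    # Loop over every stored vector and keep the best match.
    return max(stored, key=lambda item: cosine_similarity(query_vec, item["vector"]))

# Toy usage with 3-dimensional vectors (real embeddings have hundreds of dimensions).
store = [
    {"id": "doc1", "vector": [1.0, 0.0, 0.0]},
    {"id": "doc2", "vector": [0.0, 1.0, 0.0]},
]
best = most_similar([0.9, 0.1, 0.0], store)
print(best["id"])  # doc1
```

An exhaustive loop like this is fine up to tens of thousands of vectors; beyond that you’d want an approximate-nearest-neighbor index, which would require a library.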

But can I compare the input embedding directly with the vector store files? Or do I have to “convert” the files in the vector store to embeddings first?

I mean, are the vector store files already embeddings?

Well, it depends on what you mean by “vector store”.

If the vector store just consists of vectors, then yes.

You don’t need to re-embed your documents for every query — you only embed the query itself.

You can think of the whole thing as a hash map. The embedding model is your hash algorithm.

I’m referring to the PDF files that I upload to the files endpoint and then link to the vector store.

From what I’ve seen, this processing of the file and the embedding is done automatically; I’ll attach a screenshot of the page where this is stated.

So I want to know whether, when I retrieve these files from the vector store, they will still be embeddings so that I can compute cosine similarity against the user input embedding, or whether I will have to embed the files first and then do the cosine similarity.

I apologize if I’ve been confusing, or if my question is very basic; this is the first time I’m dealing with artificial intelligence and the OpenAI API.

I thought you didn’t want to use assistants? :thinking:

The Assistants file search and vector store is a tool for Assistants.

https://platform.openai.com/docs/assistants/tools/file-search/vector-stores

Using Assistants may not be the worst idea if you just want to slap something together as a PoC.

I’m not using Assistants myself, but my understanding is that you won’t need to deal with cosine similarity and all that; you just attach the vector store to your assistant and OpenAI does the rest.

Or more plainly: in Assistants, a vector store is accessible only through a file search tool the AI might call, and only in that endpoint.

You would have to build your own embeddings-based vector database for chat completions. Then, for each input where you want to inject knowledge automatically, run an embeddings model call on the input (perhaps with additional context or some language transformation) to get an input vector. Use that vector for an exhaustive search against all database entries, and place the top results into context in a format the AI can understand, such as a system message: “you have this additional knowledge automatically added for fulfillment of the most recent user input…”

It can then be automatic, apply a quality threshold, use semantic or manual chunking techniques, avoid requiring an AI decision to call a tool (which doubles the internal calls and costs), and beat Assistants in all the other ways.
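The flow described above can be sketched like this. Note that `embed` here is a stub returning precomputed toy vectors, not a real API wrapper; in practice you would call OpenAI’s `/v1/embeddings` endpoint there. All names, texts, and the threshold value are assumptions for illustration:

```python
# Sketch of the chat-completions RAG flow: embed the input, run an
# exhaustive search with a quality threshold, and inject the top hits
# as a system message before calling chat completions.

def embed(text):
    # Placeholder for a real embeddings API call; returns toy 2-D vectors.
    demo = {
        "refund policy": [1.0, 0.0],
        "Our refund window is 30 days.": [0.95, 0.05],
        "Shipping takes 5 business days.": [0.1, 0.9],
    }
    return demo[text]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k(query_vec, db, k=2, threshold=0.5):
    # Exhaustive search over all entries; keep only hits above the threshold.
    scored = sorted(db, key=lambda d: dot(query_vec, d["vector"]), reverse=True)
    return [d for d in scored[:k] if dot(query_vec, d["vector"]) >= threshold]

def build_messages(user_input, db):
    hits = top_k(embed(user_input), db)
    knowledge = "\n".join(h["text"] for h in hits)
    return [
        {"role": "system",
         "content": "You have this additional knowledge automatically added "
                    "for fulfillment of the most recent user input:\n" + knowledge},
        {"role": "user", "content": user_input},
    ]

# Embed each document chunk once, up front — not per query.
db = [
    {"text": "Our refund window is 30 days.",
     "vector": embed("Our refund window is 30 days.")},
    {"text": "Shipping takes 5 business days.",
     "vector": embed("Shipping takes 5 business days.")},
]
messages = build_messages("refund policy", db)
print(messages[0]["content"])
```

The resulting `messages` list is what you would pass to the chat completions endpoint; the threshold keeps irrelevant chunks (here, the shipping text) out of the context entirely.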

I understand. I didn’t want to use Assistants because, from what I saw, it uses a lot of tokens.

In this case I would have to use my own vector database, right?

My plan was to use OpenAI’s own vector store without necessarily using Assistants, but from what I see that won’t be possible.