Is the original data stored in the vector DB?

y.jafari.cs · March 5, 2024, 5:21pm

For example, when we upload a 1 GB movie to a vector database, is the entire movie stored in the database? Or is only the vector stored, along with some metadata that references the original file?

Macha · March 5, 2024, 7:29pm

Hey there and welcome to the community!

Which database are you using?

y.jafari.cs · March 6, 2024, 10:50am

I don't know exactly; Chroma or Milvus.
Which one is better?

vb · March 6, 2024, 11:17am

Hi!

The original data has to be stored somewhere so that you know what the vectors actually refer to. You can either store it as metadata alongside the embedding in the vector database, or store it in another database and keep a reference to it as metadata in the vector DB.
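To make the two options concrete, here is a minimal sketch in plain Python. It is not the Chroma or Milvus API: `ToyVectorStore`, the hand-written vectors, and the `s3://example-bucket/...` URI are all made up for illustration, and a real setup would get its vectors from an embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.records = []

    def add(self, vector, metadata):
        # Each record is just a vector plus whatever metadata you attach.
        self.records.append({"vector": vector, "metadata": metadata})

    def query(self, vector, k=1):
        # Rank records by similarity and return the metadata of the top k.
        ranked = sorted(
            self.records,
            key=lambda r: cosine(vector, r["vector"]),
            reverse=True,
        )
        return [r["metadata"] for r in ranked[:k]]

store = ToyVectorStore()

# Option 1: keep the original chunk inline as metadata.
store.add([0.9, 0.1, 0.0], {"text": "Cats are small domesticated felines."})

# Option 2: keep only a reference; the file itself lives elsewhere
# (the URI and time range below are invented for this example).
store.add(
    [0.0, 0.2, 0.9],
    {"source_uri": "s3://example-bucket/movies/clip-001.mp4", "start_s": 0, "end_s": 30},
)

inline_hit = store.query([1.0, 0.0, 0.0])[0]
reference_hit = store.query([0.0, 0.0, 1.0])[0]
print(inline_hit)      # metadata that contains the original text
print(reference_hit)   # metadata that only points at the original file
```

Either way, a search returns metadata rather than the vector, so the metadata is where you decide whether the original content or just a pointer to it comes back.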

y.jafari.cs · March 6, 2024, 2:51pm

Hi,
So you're saying that, for example, a 1 GB movie is not stored in the database itself?
Rather, it is stored somewhere like object storage, and the vector's metadata holds a reference to it?
I read somewhere that data (for example, a movie given as input to an AI system) is broken into chunks and stored in the database.

dignity_for_all · March 6, 2024, 3:21pm

Converting video to vector data is not commonly done. In theory, it is possible to convert each frame of a video into an image and then vectorize it. However, this would require a large amount of computation.
And it is very difficult to vectorize video efficiently.

Even in the case of text, once the original data is converted to vector data, it is not possible to restore the original text from the vector data.

So, it is necessary to store the original text data separately from the vectorized data during the embedding process.

Even if the video is converted to vector data and stored in a vector database, it is useless if the video itself is not stored separately from the vector data.
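As a rough illustration of the text case, the sketch below chunks a document, "embeds" each chunk, and keeps the vectors and the original chunks in two separate stores joined by a shared ID. `chunk_text` and `toy_embed` are hypothetical helpers, and `toy_embed` is only a stand-in for a real embedding model; it is there to show that the vector is a lossy representation.

```python
def chunk_text(text, size):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def toy_embed(chunk):
    """Stand-in for an embedding model: three numbers, lossy by design."""
    return [len(chunk), sum(map(ord, chunk)) % 97, chunk.count(" ")]

vectors = {}    # chunk_id -> vector (what the vector database would index)
originals = {}  # chunk_id -> original text (kept separately)

document = "Vector databases store embeddings. The source text lives elsewhere."

for n, chunk in enumerate(chunk_text(document, 32)):
    chunk_id = f"doc1-{n}"
    vectors[chunk_id] = toy_embed(chunk)
    originals[chunk_id] = chunk

# The text can only be recovered from the originals, never from the vectors.
restored = "".join(originals[f"doc1-{n}"] for n in range(len(originals)))
print(restored == document)
```

The same ID-based join would apply to video segments: the vector database holds one vector per segment, and the ID leads back to the file or time range stored elsewhere.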

vb · March 6, 2024, 4:29pm

Yes, that’s correct.
The embedding vector is ultimately comparable to a summary of the chunk you embedded.

I suggest you spend some time looking into the process. You will then be able to understand the issues with video embeddings.
