Looking for best practices for using a vector database, storing metadata, and caching.
I need to continuously embed new documents into my vector database and make them (the pages) searchable, so I need somewhere to store the metadata. I still want to be able to scale the application, and I don't want to be limited by storing the metadata inside the vector database itself (as in Pinecone).
I recommend starting with a simple JSON file. This will give you an easy, flexible, and forgiving environment for experimenting and figuring out what works for you.
A 500MB JSON file ~= 25,000 embeddings.
I haven’t had reason to migrate away from this for my personal notes since the 500MB/25K embeddings limit handles my requirements and then some.
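For anyone who wants to try the JSON-file approach, here is a minimal sketch of what it could look like. The record layout (`id`, `embedding`, `metadata`), the function names, and the brute-force cosine search are my own assumptions for illustration, not a description of the setup above, and the toy 2-dimensional vectors stand in for real embeddings.

```python
import json
import math
import os
import tempfile


def load_store(path):
    """Return the list of records in the JSON file, or [] if it does not exist yet."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_store(path, records):
    """Write to a temp file first, then swap it in, so a crash cannot truncate the store."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(records, f)
    os.replace(tmp, path)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(records, query_vec, top_k=3):
    """Brute-force scan: score every record and return the top_k (score, record) pairs."""
    scored = [(cosine(r["embedding"], query_vec), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical usage with toy vectors and made-up metadata fields.
store_path = os.path.join(tempfile.mkdtemp(), "embeddings.json")
records = load_store(store_path)
records.append({"id": "doc1-p1", "embedding": [1.0, 0.0], "metadata": {"source": "doc1", "page": 1}})
records.append({"id": "doc1-p2", "embedding": [0.0, 1.0], "metadata": {"source": "doc1", "page": 2}})
save_store(store_path, records)

for score, record in search(load_store(store_path), [0.9, 0.1], top_k=1):
    print(record["id"], record["metadata"], round(score, 3))
```

Because the metadata sits next to each vector in the same record, a search hit already carries its page and source. The whole file is loaded and scanned on every query, which is the trade-off that makes this only suitable at small scale.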
How do I convert JSON data to documents in LangChain? Later I want to convert those documents into embeddings.
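One possible way to do this, as a rough sketch: parse the JSON yourself and build one `Document` per item, putting the text in `page_content` and everything else in `metadata`. The input shape (a JSON array of objects with a `text` field), the `text_key` parameter, and the `json_to_documents` helper are assumptions for illustration. The import path assumes a recent LangChain install with `langchain_core`; the fallback dataclass is only a stand-in so the snippet still runs without LangChain.

```python
import json
from dataclasses import dataclass, field

try:
    # Assumes a recent LangChain release that provides langchain_core.
    from langchain_core.documents import Document
except ImportError:
    # Stand-in with the same two fields, used only when LangChain is not installed.
    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def json_to_documents(raw_json, text_key="text"):
    """Turn a JSON array of objects into Documents (hypothetical helper).

    The value under text_key becomes page_content; all other keys become metadata.
    """
    items = json.loads(raw_json)
    docs = []
    for item in items:
        metadata = {k: v for k, v in item.items() if k != text_key}
        docs.append(Document(page_content=item[text_key], metadata=metadata))
    return docs


# Hypothetical input: two pages of one source file.
raw = json.dumps([
    {"text": "First page of the report.", "source": "report.pdf", "page": 1},
    {"text": "Second page of the report.", "source": "report.pdf", "page": 2},
])

for doc in json_to_documents(raw):
    print(doc.metadata, doc.page_content)
```

The resulting list can then be passed to whichever embedding or vector store step you use next. LangChain also has a `JSONLoader` in its community document loaders that may fit better if your JSON is nested, though I have not shown it here.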