I have been reading through the forum on embedding, saving and retrieving vectors and then using those retrieved embeddings and their context to answer queries.
I have been trying to build a simple web app for our internal users who work with legal documents (property searches, so lots of similar documents for different properties). My understanding of how I could implement a small proof-of-concept web portal is:
- Extract and collect data from those legal documents. I already have a JSON file created from them, so my data is in well-defined sections.
- Convert those sections for each report into embeddings and save them in a vector database (thinking of going with the popular option of Pinecone).
- Convert the user query into an embedding, find the most relevant match(es), get their string representations, use those plain-text bits as context for the query, and use a completion to return an accurate and nicely formatted answer to the user.
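The indexing step above could be sketched roughly as follows. This is a sketch under assumptions, not a definitive implementation: the section shape, field names (`id`, `property_ref`, `title`, `text`), index name, and embedding model are all illustrative; the OpenAI and Pinecone Python clients are used.

```python
# Sketch of the indexing step: turn JSON sections into Pinecone records,
# storing the plain text and property reference as metadata alongside the
# vector. All field names and the index name are illustrative assumptions.

def embed(text: str) -> list[float]:
    """Embed one chunk with OpenAI (requires OPENAI_API_KEY to be set)."""
    from openai import OpenAI  # imported here so the rest runs without the SDK
    client = OpenAI()
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

def build_records(sections: list[dict], embed_fn=embed) -> list[dict]:
    """Build Pinecone upsert records from JSON sections.

    Each section is assumed to look like:
    {"id": "...", "property_ref": "...", "title": "...", "text": "..."}
    """
    return [
        {
            "id": s["id"],
            "values": embed_fn(s["text"]),
            # The plain text lives in metadata so a query can return it
            # directly; property_ref enables exact filtering later.
            "metadata": {
                "text": s["text"],
                "section": s["title"],
                "property_ref": s["property_ref"],
            },
        }
        for s in sections
    ]

# Upserting would then look roughly like (Pinecone Python client):
# from pinecone import Pinecone
# index = Pinecone(api_key="...").Index("legal-reports")
# index.upsert(vectors=build_records(sections))
```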
I am slightly confused about a couple of points and just wanted suggestions to validate my understanding, and corrections where needed.
- When I save embeddings for each chunk of each report, do I save the text representation of that vector in Pinecone as well, or do I use another storage solution like an RDBMS to keep the vector-to-string relationship? Is vector storage also meant for saving plain text?
- When my user asks a question, I know which property (report) they are currently asking about, but to get the correct embeddings for that property, do I make the address/reference number part of each embedding, or save it as a separate field/column in the Pinecone index? For example, if a user queries "list all planning charges", do I add the property address to the query, e.g. "list all planning charges for house number 1, street 1……"? Or will that get me a lot of false matches, since the address will match in every single embedding for that report, so the query may even ignore the planning-charges bit and instead return "financial charges for house number 1, street 1…."?
- Is there a way to filter my results from the vector storage based on property address if I save it as plain text in the index for each embedding?
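One way the two questions above are commonly handled is to keep the address out of the embedded text entirely and apply it as an exact metadata filter at query time, so the semantic match is only on the question itself. A minimal sketch, assuming a `property_ref` metadata field was stored at indexing time (the field name is an assumption; the `$eq` filter syntax is Pinecone's):

```python
# Sketch of a filtered query: embed only the user's question (no address),
# and restrict matches to one property via a metadata filter, so the
# address cannot produce false matches. Field names are assumptions.

def build_query(question_vector: list[float],
                property_ref: str,
                top_k: int = 5) -> dict:
    """Build keyword arguments for Pinecone's index.query()."""
    return {
        "vector": question_vector,
        "top_k": top_k,
        "include_metadata": True,  # so the stored plain text comes back
        # Exact-match filter; the address never pollutes the embedding.
        "filter": {"property_ref": {"$eq": property_ref}},
    }

# Usage would then look roughly like:
# results = index.query(**build_query(embed("list all planning charges"),
#                                     "prop-1"))
# context = [m["metadata"]["text"] for m in results["matches"]]
```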
- Sometimes the large reports have a lot of information in each section, which means I can't send all of the relevant chunks to OpenAI in one prompt due to the token limit. Is the recommended approach sending one call per chunk as context and then combining the result of each call, or should I merge chunks to make fewer calls?
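On the last point, a common middle ground is to greedily pack retrieved chunks into as few prompts as will fit the budget, rather than one call per chunk. A minimal sketch: note that OpenAI's limit is really in tokens, not characters, so the character budget here stands in as a rough proxy (roughly 4 characters per token for English text); the function name and budget value are assumptions.

```python
# Greedily merge retrieved chunks into batches that each fit a size
# budget, so the number of completion calls is minimised while every
# call still sees whole chunks. The character budget approximates a
# token budget (~4 chars per token for English).

def pack_chunks(chunks: list[str], budget_chars: int = 12000) -> list[str]:
    """Return one combined context string per completion call."""
    batches: list[str] = []
    current: list[str] = []
    size = 0
    for chunk in chunks:
        # Start a new batch when adding this chunk would overflow.
        if current and size + len(chunk) > budget_chars:
            batches.append("\n\n".join(current))
            current, size = [], 0
        current.append(chunk)
        size += len(chunk)
    if current:
        batches.append("\n\n".join(current))
    return batches
```

The per-batch answers can then be combined in one final summarising call, which is usually far cheaper than one call per chunk.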
Apologies for the long post but couldn’t stop adding all this detail.