Hi,
I have 300+ PPTX files and I want to create a chatbot which queries the PPTX file data. Since these PPTX files are large, I decided to use the following approach:
1. Read all PPTX files and generate a summary of each file.
2. Store the summary of each PPTX file in a vector database along with the source document metadata.
3. Query the vector database based on the user's query.
4. Pass the query and the returned documents to an LLM to get the final output.
5. Return the final output and the source documents to the user.
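To make the query side of this plan concrete, here is a rough sketch of the last three steps. It assumes `store` is any object with a LangChain-style `similarity_search(query, k=...)` method that returns documents carrying `page_content` and `metadata`, and that `llm` is a plain callable taking a prompt string and returning a string. The function name `answer_with_sources`, the prompt wording, and the `"source"` metadata key are made up for illustration.

```python
from typing import Any, Callable, Dict


def answer_with_sources(
    query: str,
    store: Any,
    llm: Callable[[str], str],
    k: int = 4,
) -> Dict[str, Any]:
    """Retrieve the k closest summaries, ask the LLM, and report the sources."""
    # Query the vector database with the user's question.
    docs = store.similarity_search(query, k=k)

    # Pass the query plus the retrieved summaries to the LLM.
    context = "\n\n".join(doc.page_content for doc in docs)
    prompt = (
        "Answer the question using only the summaries below.\n\n"
        f"{context}\n\n"
        f"Question: {query}"
    )
    answer = llm(prompt)

    # Return the answer together with the files it came from.
    # "source" is an assumed metadata key set when the summaries were stored.
    sources = [doc.metadata.get("source") for doc in docs]
    return {"answer": answer, "sources": sources}
```

Because only summaries are indexed, the LLM sees the summary text rather than the original slides, so any detail dropped during summarization cannot be recovered at query time.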
I am using UnstructuredPowerPointLoader to load the PPTX files and load_summarize_chain to create a summary of each file. The chain returns a string.
How can I store the output of load_summarize_chain in a vector database (Chroma)?
Also, please let me know whether this approach is sound. Any sample code would be really helpful.
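For reference, a minimal sketch of the storage step, under a few assumptions: the summaries have already been collected into a dict mapping each file path to its summary string, `embedding` is any LangChain embeddings object, and the installed LangChain version exposes `Chroma` under one of the two import paths tried below (the path has moved between releases). The helper names `build_records` and `store_summaries`, the `"source"` metadata key, and the `chroma_db` directory are illustrative choices, not part of any library.

```python
import hashlib
from typing import Dict, List, Tuple


def build_records(
    summaries: Dict[str, str],
) -> Tuple[List[str], List[Dict[str, str]], List[str]]:
    """Turn {file_path: summary} into parallel lists of texts, metadata, and ids."""
    texts, metadatas, ids = [], [], []
    for path, summary in summaries.items():
        texts.append(summary)
        # Keep the originating file so it can be shown to the user later.
        metadatas.append({"source": path})
        # A stable id per file, so re-running replaces rather than duplicates
        # (assumes the store upserts on matching ids).
        ids.append(hashlib.sha1(path.encode("utf-8")).hexdigest())
    return texts, metadatas, ids


def store_summaries(summaries: Dict[str, str], embedding, persist_directory: str = "chroma_db"):
    """Embed the summary strings and write them to a Chroma collection."""
    # Import lazily; the module path depends on the LangChain version installed.
    try:
        from langchain_community.vectorstores import Chroma
    except ImportError:
        from langchain.vectorstores import Chroma

    texts, metadatas, ids = build_records(summaries)
    return Chroma.from_texts(
        texts=texts,
        embedding=embedding,
        metadatas=metadatas,
        ids=ids,
        persist_directory=persist_directory,
    )
```

The point is that the chain's string output does not need to be a `Document` to be stored: `from_texts` accepts raw strings plus a parallel list of metadata dicts. Wrapping each summary in a `Document` and calling `Chroma.from_documents` should work equally well.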