How to load all types of documents (.pdf, .txt, .docx, .csv, Excel) through a document loader using Pinecone with the LangChain wrapper?

I am building an interactive chatbot that can take input from multiple data sources such as PDF, Word, text, and Excel files. I am using the Pinecone retriever with the LangChain wrapper on top of it. When I use DirectoryLoader with a glob pattern, I'm unable to load any file type other than PDF and convert it to vector embeddings. I need a way to load the rest of the documents and process them further for embeddings.

Hi, have you received any answer, or have you found a solution?

Hi, welcome to the community. I think you will get a better response in the LangChain community. You can also ask the Mendable bot on their site; just press Ctrl + K to use it.

You mean PineconeHybridSearchRetriever?

Hi, we actually need to handle each file type differently, both while loading and while processing. I was also stuck on this, but then I got this solution and it worked.
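
For anyone landing here, a minimal sketch of the "handle each file type differently" idea: route every file to a loader by its extension, then collect the resulting documents for splitting and embedding. This is not necessarily the exact solution mentioned above. The extension-to-loader mapping and the helper names `group_by_loader` and `load_all` are my own assumptions. The loader classes are from `langchain_community.document_loaders` (older LangChain versions expose them under `langchain.document_loaders`), and each one needs its own extra package installed (for example `pypdf`, `docx2txt`, `unstructured`).

```python
from collections import defaultdict
from pathlib import Path

# Assumed mapping: file extension -> LangChain loader class name.
# Adjust to the loaders and file types you actually use.
LOADER_NAMES = {
    ".pdf": "PyPDFLoader",
    ".txt": "TextLoader",
    ".docx": "Docx2txtLoader",
    ".csv": "CSVLoader",
    ".xlsx": "UnstructuredExcelLoader",
}


def group_by_loader(paths):
    """Map each loader name to the files it should handle.

    Files with an extension missing from LOADER_NAMES are skipped.
    """
    groups = defaultdict(list)
    for path in map(Path, paths):
        name = LOADER_NAMES.get(path.suffix.lower())
        if name:
            groups[name].append(str(path))
    return dict(groups)


def load_all(directory):
    """Load every supported file under `directory` into LangChain documents."""
    # Imported lazily so the routing above works without LangChain installed.
    import importlib

    loaders = importlib.import_module("langchain_community.document_loaders")
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    docs = []
    for name, file_paths in group_by_loader(files).items():
        loader_cls = getattr(loaders, name)
        for file_path in file_paths:
            # Each loader takes the file path and returns a list of Documents.
            docs.extend(loader_cls(file_path).load())
    return docs
```

The list returned by `load_all("data/")` (the folder name is just a placeholder) can then go through a text splitter and into the Pinecone vector store the same way the PDF documents already do.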