How to load all types of documents (.pdf, .txt, .docx, .csv, .excel) through document loader using Pinecone through Langchain wrapper?

hunterr · September 27, 2023, 6:56am

I am building an interactive chatbot that takes input from multiple data sources such as PDF, Word, text, and Excel files. I am using the Pinecone retriever through the LangChain wrapper. When I use DirectoryLoader with a glob pattern, only the PDF files are loaded and converted to vector embeddings; the other file types are not. I need a way to load the remaining document types and process them for embeddings.

kpathak1 · October 4, 2023, 6:44am

Hi, have you received an answer or found a solution?

Innovatix · November 3, 2023, 12:21pm

Hi, welcome to the community. I think you are more likely to get a response in the LangChain community. You can also ask the Mendable bot on their site; just press Ctrl + K to use it.

You mean PineconeHybridSearchRetriever?

kpathak1 · November 20, 2023, 9:45am

Hi. We actually need to handle each file type differently, both when loading and when processing. I was stuck on this too, but then I found this solution and it worked.

sagar.hande.work · December 18, 2023, 10:40am

@kpathak1 Can you please specify which solution worked for you?

kpathak1 · December 18, 2023, 11:30am

You need to handle each document type differently in Python and then process the result. Ready-made code for handling each type is easy to find on Google. In my case, I wrote one program for the .pdf extension and separate programs for the .csv, .docx, and .txt extensions.
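
For anyone landing here later, the per-extension approach described above might look roughly like the sketch below. It is not kpathak1's actual code. It uses only the Python standard library, so only .txt and .csv are implemented. The names `LOADERS`, `load_txt`, `load_csv`, and `load_directory`, as well as the dict shape used for a document, are hypothetical and chosen for illustration. For .pdf, .docx, and Excel files you would register your own functions built on whatever parsing library you prefer.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List

# A "document" here is a plain dict with text and a source label.
# This shape is an assumption for the sketch, not a LangChain type.
Document = Dict[str, str]
Loader = Callable[[Path], List[Document]]


def load_txt(path: Path) -> List[Document]:
    """Read a plain-text file as a single document."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [{"text": text, "source": str(path)}]


def load_csv(path: Path) -> List[Document]:
    """Turn each CSV row into one document of 'column: value' lines."""
    docs: List[Document] = []
    with path.open(newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f)):
            text = "\n".join(f"{key}: {value}" for key, value in row.items())
            docs.append({"text": text, "source": f"{path}#row{i}"})
    return docs


# Map each file extension to the function that knows how to read it.
# Add entries such as ".pdf" or ".docx" with your own loader functions.
LOADERS: Dict[str, Loader] = {
    ".txt": load_txt,
    ".csv": load_csv,
}


def load_directory(root: str) -> List[Document]:
    """Walk a directory and load every file that has a registered loader."""
    docs: List[Document] = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        loader = LOADERS.get(path.suffix.lower())
        if loader is None:
            # Report unsupported types instead of silently dropping them.
            print(f"skipping unsupported file: {path}")
            continue
        docs.extend(loader(path))
    return docs
```

Calling `load_directory("data")` on a hypothetical `data` folder returns one flat list of documents regardless of file type, which can then be split, embedded, and upserted to Pinecone in a single pass. Printing the skipped files makes it obvious when a file type has no loader, which is the symptom described in the original question.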
