Long time to complete RetrievalQA invocation

I am very new to implementing embeddings. I wrote a simple application that reads text from a PDF, chunks it into pieces of 500 words each, and uses OpenAIEmbeddings with the Chroma.from_texts method to create a vector store on disk.
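Roughly, the ingestion step looks like this (a simplified sketch, not my exact code; the pypdf reader and the plain word-based splitter here are just illustrative):

from pypdf import PdfReader
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

def build_vector_store(pdf_path, save_path, chunk_words=500):
    # Pull the raw text out of the PDF
    reader = PdfReader(pdf_path)
    text = " ".join(page.extract_text() or "" for page in reader.pages)
    # Split into chunks of roughly 500 words each
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]
    # Embed the chunks and persist the Chroma index to disk
    return Chroma.from_texts(chunks, OpenAIEmbeddings(), persist_directory=save_path)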

Once the Chroma index has been created in the background for a given set of documents, I create a RetrievalQA chain from the vector store for every question and then invoke the chain with that question.

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

vector_store = load_vector_store(save_path)  # load the persisted Chroma index
llm = ChatOpenAI(model="gpt-4")
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vector_store.as_retriever())
answer = qa_chain.invoke(question)

I tested this with both small and large datasets, and it works well functionally. But I observe that the qa_chain.invoke call takes almost 1.5 minutes to execute.
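To narrow down where the time goes, I could time the retrieval step and the full chain separately, something like this (sketch only; this instrumentation is not in my current code):

import time

start = time.perf_counter()
docs = vector_store.as_retriever().invoke(question)  # retrieval step only
print(f"retrieval: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
answer = qa_chain.invoke(question)  # full chain including the GPT-4 call
print(f"full chain: {time.perf_counter() - start:.2f}s")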

Can someone review this and advise on what could be wrong here?
