Need Help with RAG and Embeddings

I wrote a very simple Streamlit app that would upload a document through Chroma and then let students ask questions about that document. The only issue was that any large document ran into token limits. That sent me on a LONG journey away from Chroma and back again, and I think I'm close, but I'm still missing the boat. I know that if I feed my entire vector_store into the prompt I will get a token error, so I need to grab just the top N results instead.
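Grabbing the top N results is exactly what a vector store's similarity search does under the hood: embed the query, score it against every chunk embedding, and keep the best N. A toy illustration of that idea in pure Python (no libraries assumed, 2-dimensional vectors just for readability):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_n(query_vec, chunk_vecs, n=3):
    # Rank chunk indices by similarity to the query, highest first,
    # and keep only the n best -- this is the "retrieval" step.
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:n]

# Toy example: chunk 0 points almost the same way as the query.
q = [1.0, 0.0]
chunks = [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
print(top_n(q, chunks, n=2))  # -> [0, 2]
```

Only the text of those N chunks goes into the prompt, which is what keeps the request under the token limit.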

I have the following code. Note that the retriever in chain.invoke will not work, and that is where I need any help you might be willing to provide. Thank you!

docs = st.session_state["loader"].load()
text_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", ". ", " ", ""],
    chunk_size=1000,
    chunk_overlap=0,
)
token_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0, tokens_per_chunk=256)
character_split_texts = text_splitter.split_text("\n\n".join(doc.page_content for doc in docs))
token_split_texts = []
for text in character_split_texts:
    token_split_texts += token_splitter.split_text(text)
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
first_time = True
for chunk in token_split_texts:
    if first_time:
        # from_texts, not from_documents: the splitters return plain strings
        vector_store = Chroma.from_texts([chunk], embeddings)
        first_time = False
    else:
        # add to the existing store instead of creating a new one each pass
        vector_store.add_texts([chunk])
    sleep(60)  # crude rate limiting for the embeddings API
vector_store.persist()
retriever = vector_store.as_retriever()
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", qa_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{query}"),
    ]
)
chain = (qa_prompt | llm)
with get_openai_callback() as cb:
    ai_msg = chain.invoke({"query": question, "context": retriever,
                           "history": st.session_state["history"]})
    st.session_state["costs"].append(cb.total_cost)
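One way to make the retrieval step explicit (a sketch, not the only option: it assumes a recent LangChain where retrievers expose `.invoke()`, and that `qa_system_prompt` contains a `{context}` placeholder) is to fetch the top-k chunks yourself and join their text into a plain string before calling the chain. The `Document` class below is a stand-in for LangChain's, just so the formatting helper is runnable on its own:

```python
from dataclasses import dataclass

# Stand-in for langchain's Document, for illustration only.
@dataclass
class Document:
    page_content: str

def format_docs(docs):
    # Join the retrieved chunks into one context string for the prompt.
    return "\n\n".join(doc.page_content for doc in docs)

# With the real objects from the post, the call would look like:
#   retriever = vector_store.as_retriever(search_kwargs={"k": 4})  # top 4 chunks
#   top_docs = retriever.invoke(question)  # .get_relevant_documents() on older versions
#   ai_msg = chain.invoke({"query": question,
#                          "context": format_docs(top_docs),
#                          "history": st.session_state["history"]})
print(format_docs([Document("chunk one"), Document("chunk two")]))
```

The key point is that `"context"` receives a string, not the retriever object itself; passing the retriever into `chain.invoke` just stuffs its `repr` into the prompt.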
