Need Help with RAG and Embeddings

I wrote a very simple Streamlit app that would load a document into Chroma and then let students ask questions about that document. The only issue was that any large document ran into token limits. This has led me on a LONG journey beyond Chroma, and now back, and I think I am close, but I am still missing the boat. I know that if I feed my entire vector_store into the prompt I will get a token error, so I need to grab only the top N results instead.
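To make sure I understand the top-N idea, here is a toy, library-free sketch of what I believe the vector store does under the hood (the function names `cosine` and `top_n` are mine, not from any library): rank the chunk embeddings by cosine similarity to the query embedding and keep only the best N.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_n(query_vec, chunk_vecs, n=2):
    # Indices of the n chunks most similar to the query, best first
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:n]

# Chunk 0 matches the query exactly, chunk 2 is close, chunk 1 is orthogonal
print(top_n([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]], n=2))  # → [0, 2]
```

As I understand it, `vector_store.as_retriever()` does exactly this ranking for me; I just need the prompt to receive those N chunks rather than the whole store.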

I have the following code. Note that the retriever in chain.invoke will not work, and that is where I need any help you might be willing to provide. Thank you!

```python
docs = st.session_state["loader"].load()

text_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", ". ", " ", ""],
)
token_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0, tokens_per_chunk=256)

character_split_texts = text_splitter.split_text("\n\n".join(doc.page_content for doc in docs))
token_split_texts = []
for text in character_split_texts:
    token_split_texts += token_splitter.split_text(text)

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Build the store from the first chunk, then add the remaining chunks to it
first_time = True
for splitted_document in token_split_texts:
    if first_time:
        vector_store = Chroma.from_texts([splitted_document], embeddings)
        first_time = False
    else:
        vector_store.add_texts([splitted_document])

retriever = vector_store.as_retriever()

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", qa_system_prompt),
    ("human", "{query}"),
])
chain = qa_prompt | llm

with get_openai_callback() as cb:
    ai_msg = chain.invoke({"query": question,
                           "context": retriever,  # <-- this is the part that does not work
                           "history": st.session_state["history"]})
```
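In case it helps clarify what I am aiming for, here is a self-contained mock of the pattern I *think* I need (the `Doc` and `StubRetriever` classes are stand-ins for LangChain objects, not real library code): invoke the retriever with the question first, join the text of the top-N chunks it returns, and pass that string, not the retriever object, as the context.

```python
class Doc:
    """Stand-in for a LangChain Document."""
    def __init__(self, page_content):
        self.page_content = page_content

class StubRetriever:
    """Stand-in for vector_store.as_retriever(); returns the top k chunks."""
    def __init__(self, docs, k=2):
        self.docs, self.k = docs, k
    def invoke(self, query):
        # A real retriever would embed `query` and rank by similarity;
        # this stub just returns the first k chunks to show the shapes.
        return self.docs[: self.k]

retriever = StubRetriever([Doc("chunk one"), Doc("chunk two"), Doc("chunk three")])

docs = retriever.invoke("what is in the document?")       # run retrieval first
context_text = "\n\n".join(d.page_content for d in docs)  # this string goes in the prompt
print(context_text)  # → "chunk one\n\nchunk two"
```

So in my real code the last line would become `chain.invoke({"query": question, "context": context_text, "history": ...})`, with `{context}` referenced in the system prompt. Is that the right way to wire it, or is there a more idiomatic chain construction?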
