I am trying to follow the simple example provided by deeplearning.ai in their short course tutorial.
As per the tutorial, the following steps are performed:
- Load the text
- Split the text
- Create embeddings using the OpenAI Embeddings API
- Load the embeddings into a Chroma vector DB
- Save the Chroma DB to disk
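For reference, the steps above could be sketched roughly as follows. This is only a minimal sketch, assuming an older langchain release in which these import paths exist. The function name `build_and_persist`, the use of `TextLoader`, and the chunk sizes are illustrative assumptions and not the tutorial's exact code.

```python
def build_and_persist(path, persist_directory, embedding):
    """Load a text file, split it, embed the chunks, and save the store to disk.

    `embedding` is an embeddings object such as OpenAIEmbeddings.
    Imports live inside the function so that defining it does not
    require langchain to be installed.
    """
    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.vectorstores import Chroma

    # 1. Load the text (TextLoader is an assumption; the tutorial may use another loader)
    docs = TextLoader(path).load()

    # 2. Split the text into chunks (sizes here are arbitrary example values)
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(docs)

    # 3 and 4. Embed the chunks and load them into a Chroma vector DB
    vectordb = Chroma.from_documents(
        documents=chunks,
        embedding=embedding,
        persist_directory=persist_directory,
    )

    # 5. Save the Chroma DB to disk (newer Chroma versions may persist automatically)
    vectordb.persist()
    return vectordb
```

It would be called once, for example as `build_and_persist("notes.txt", "embeddings\\", embedding)`, where `notes.txt` is a hypothetical input file.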
I am able to follow the above sequence.
Now I want to start by loading the saved embeddings from disk and go straight to the question-answering part, rather than repeating the first four steps every time I run the program.
Here are the snippets of code that I am using:
vectordb = Chroma(persist_directory="embeddings\\")
Checking the store after this prints 188, which means the data is present, but how do I make use of it? Using the code below:
docs = vectordb.similarity_search(question,k=3)
I get the following error:
You must provide embeddings or a function to compute them
Any help on how to define the function, or a pointer to the langchain API for embeddings, would be appreciated.
I’ve been struggling with this same issue for the last week. I’ve tried nearly everything, but I can’t get the vector store reconnected after the script is shut down and a reconnection is attempted from a new script using the same embeddings and persist dir.
I haven’t found much on the web, but from what I can tell, a few others are struggling with the same thing, and everybody says to just go dig into the langchain source code to figure it out.
Wish someone would just give an answer others could leverage.
I just gave up on it, no time to solve this unfortunately.
The answer was in the tutorial itself. I had to go through it multiple times, line by line, until I noticed it.
Here is what worked for me:
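from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
# api_key holds your OpenAI API key; use the same embedding model that built the store
embedding = OpenAIEmbeddings(openai_api_key=api_key)
db = Chroma(persist_directory="embeddings\\", embedding_function=embedding)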
The embedding_function parameter accepts an OpenAIEmbeddings object, which the store then uses to embed the question before searching.
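To illustrate why the reloaded store needs an embedding function even though the vectors are already on disk, here is a toy sketch in plain Python. It is not Chroma's implementation: `ToyStore`, `toy_embed`, and `cosine` are hypothetical names made up for this example, and the letter-count "embedding" is only a stand-in for a real model. The point is that the saved vectors cover the documents, but the question still has to be turned into a vector at query time.

```python
import math


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: counts of each vowel."""
    text = text.lower()
    return [float(text.count(ch)) for ch in "aeiou"]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyStore:
    """Hypothetical miniature vector store, for illustration only."""

    def __init__(self, records, embedding_function=None):
        self.records = records  # (text, vector) pairs, as if read back from disk
        self.embedding_function = embedding_function

    def similarity_search(self, question, k=3):
        # The stored vectors are not enough: the question must be embedded too.
        if self.embedding_function is None:
            raise ValueError(
                "You must provide embeddings or a function to compute them"
            )
        query_vector = self.embedding_function(question)
        ranked = sorted(
            self.records,
            key=lambda record: cosine(query_vector, record[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


# Pretend these pairs were computed earlier and saved to disk.
texts = ["apple pie", "blue sky", "deep sea"]
saved = [(text, toy_embed(text)) for text in texts]

# Reloading with the same embedding function makes the search work again.
store = ToyStore(saved, embedding_function=toy_embed)
print(store.similarity_search("deep sea", k=1))
```

Constructing `ToyStore(saved)` without `embedding_function` and then calling `similarity_search` raises the error, which mirrors what happens when `Chroma(persist_directory=...)` is created without `embedding_function`.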
Hope this helps somebody