Hello Community,
I am trying to switch from the old text embedding model “text-embedding-ada-002” to the new “text-embedding-3-large”.
Here is what the code looked like before:
embedding = OpenAIEmbeddings(openai_api_key=api_key)
vectordb = Chroma.from_documents(documents=texts, embedding=embedding, persist_directory=pers_dir)
vectordb.persist()
Here is what it looks like now. The only change is the model: I specified “text-embedding-3-large”, which produces a 3072-dimensional vector instead of the 1536-dimensional vector from “text-embedding-ada-002”:
embedding = OpenAIEmbeddings(openai_api_key=api_key,model="text-embedding-3-large")
vectordb = Chroma.from_documents(documents=texts, embedding=embedding, persist_directory=pers_dir)
vectordb.persist()
Then I inspected the embedding configuration attached to the vectordb object using:
vectordb.embeddings.json()
and got the following result:
{"model": "text-embedding-3-large", "deployment": "text-embedding-ada-002", "openai_api_version": "", "openai_api_base": ...
The value of the “model” key is correct, but what does the “deployment” key mean?
After that I checked the length of the first record's embedding, i.e. how many columns it has, which is the dimension of the vector produced by the embedding model:
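To make the mismatch concrete, here is a minimal sketch that parses the JSON and compares the two keys. The sample string contains only the fields quoted above (the rest of the real output is left out), and `compare_model_and_deployment` is a hypothetical helper name I made up for illustration, not part of LangChain.

```python
import json

# Sample limited to the fields quoted above; the remaining keys are omitted.
config_json = (
    '{"model": "text-embedding-3-large", '
    '"deployment": "text-embedding-ada-002", '
    '"openai_api_version": ""}'
)


def compare_model_and_deployment(raw: str) -> dict:
    """Return the model and deployment values and whether they are equal."""
    config = json.loads(raw)
    model = config.get("model")
    deployment = config.get("deployment")
    return {"model": model, "deployment": deployment, "match": model == deployment}


result = compare_model_and_deployment(config_json)
print(result)
```

In my case the real string would come from vectordb.embeddings.json() instead of the hard-coded sample.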
len(vectordb._collection.get(include=['embeddings'])["embeddings"][0])
The result is 3072, which is correct. But why does the value of the “deployment” key differ from the value of the “model” key when inspecting the vectordb object?
Thanks in advance for any answers or input.
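For reference, the dimension check above can be written as a small reusable sketch. The table of expected sizes and the helper name `dimension_matches` are my own additions for illustration, and the placeholder vector only stands in for a stored embedding so the snippet runs without a database.

```python
# Default output dimensions of the OpenAI embedding models mentioned in this post.
EXPECTED_DIMS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def dimension_matches(model: str, vector: list) -> bool:
    """Check whether a vector has the default length for the given model."""
    return len(vector) == EXPECTED_DIMS[model]


# Placeholder standing in for a real stored embedding (hypothetical data).
placeholder_vector = [0.0] * 3072

print(dimension_matches("text-embedding-3-large", placeholder_vector))
print(dimension_matches("text-embedding-ada-002", placeholder_vector))
```

With the real store, the vector would be vectordb._collection.get(include=['embeddings'])["embeddings"][0] rather than the placeholder.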