Hi there, I used to have code like `from openai.embeddings_utils import distances_from_embeddings`, but after upgrading the Python openai library today, the import fails. It looks like the module has been deleted. Any suggestions?
Thanks!
embeddings_utils.py is gone
opened 01:09PM - 06 Nov 23 UTC
closed 01:55PM - 06 Nov 23 UTC
### Describe the bug
The previous version of the OpenAI Python library contained `embeddings_utils.py`, which provided functions like `cosine_similarity` that are used for semantic text search with embeddings. Without this functionality, existing code will fail due to this dependency, including OpenAI's cookbook example: https://cookbook.openai.com/examples/semantic_text_search_using_embeddings
Are there plans to add this support back in, or should we just create our own `cosine_similarity` function based on the one that was present in `embeddings_utils`:
```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```
### To Reproduce
The cookbook example cannot be converted to v1.0 without removing its dependency on `embeddings_utils.py`: https://cookbook.openai.com/examples/semantic_text_search_using_embeddings
### Code snippets
```Python
from openai.embeddings_utils import get_embedding, cosine_similarity

# search through the reviews for a specific product
def search_reviews(df, product_description, n=3, pprint=True):
    product_embedding = get_embedding(
        product_description,
        engine="text-embedding-ada-002"
    )
    df["similarity"] = df.embedding.apply(lambda x: cosine_similarity(x, product_embedding))

    results = (
        df.sort_values("similarity", ascending=False)
        .head(n)
        .combined.str.replace("Title: ", "")
        .str.replace("; Content:", ": ")
    )
    if pprint:
        for r in results:
            print(r[:200])
            print()
    return results

results = search_reviews(df, "delicious beans", n=3)
```
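For what it's worth, the `get_embedding` half of this snippet looks straightforward to rebuild on the v1.0 client. Below is a minimal sketch, not the library's own helper. It assumes a client created with `OpenAI()` and uses `client.embeddings.create`, which takes `model` rather than the old `engine` argument. Passing the client in as a parameter is my own choice, and the newline replacement mirrors what I recall the removed helper doing.

```python
from typing import List


def get_embedding(client, text: str, model: str = "text-embedding-ada-002") -> List[float]:
    """Return the embedding vector for `text` using a v1.x OpenAI client.

    `client` is assumed to be an `openai.OpenAI()` instance (hypothetical
    parameter, not part of the removed helper's signature).
    """
    # Assumption: collapse newlines before embedding, as the old helper did.
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding


# Usage (needs the openai package and an API key in OPENAI_API_KEY):
#   from openai import OpenAI
#   client = OpenAI()
#   product_embedding = get_embedding(client, "delicious beans")
```

With this in place, `search_reviews` would only need `engine=...` swapped for the new call, plus a local `cosine_similarity`.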
### OS
Windows
### Python version
Python v3.10.11
### Library version
openai-python==1.0.0rc2
There's an alternative at https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/embeddings?tabs=python-new%2Ccommand-line
You can use this code snippet for that:

```python
from typing import List

from scipy import spatial


def distances_from_embeddings(
    query_embedding: List[float],
    embeddings: List[List[float]],
    distance_metric="cosine",
) -> List[float]:
    """Return the distance from a query embedding to each embedding in a list."""
    # Map each metric name to the matching SciPy distance function.
    distance_metrics = {
        "cosine": spatial.distance.cosine,
        "L1": spatial.distance.cityblock,
        "L2": spatial.distance.euclidean,
        "Linf": spatial.distance.chebyshev,
    }
    distances = [
        distance_metrics[distance_metric](query_embedding, embedding)
        for embedding in embeddings
    ]
    return distances
```
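If you only need the cosine metric and would rather not add SciPy, here is a rough dependency-free sketch. It relies on cosine distance being 1 minus cosine similarity, which is how the `"cosine"` entry above relates to the `cosine_similarity` function from the issue. The names `cosine_distances` and `rank_by_distance` are mine, not from the removed module.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distances(query: List[float], embeddings: List[List[float]]) -> List[float]:
    """Cosine distance (1 - similarity) from the query to each embedding."""
    return [1.0 - cosine_similarity(query, e) for e in embeddings]


def rank_by_distance(distances: List[float]) -> List[int]:
    """Indices of the embeddings, ordered from nearest to farthest."""
    return sorted(range(len(distances)), key=lambda i: distances[i])


# Toy vectors standing in for real embeddings.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
distances = cosine_distances(query, docs)
print(rank_by_distance(distances))  # [1, 2, 0]
```

For large embedding tables the NumPy or SciPy versions will be much faster. This is just meant to show what the metric computes.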