Embeddings_utils / distance formulas - where did it move?

The latest version of the openai Python library no longer contains embeddings_utils, so this import breaks:

from openai.embeddings_utils import (get_embedding, distances_from_embeddings, indices_of_nearest_neighbors_from_distances)

Where can we find the distance functions now?

  • distances_from_embeddings
  • indices_of_nearest_neighbors_from_distances

I’m also using distances_from_embeddings and am now looking at how to migrate to the new API.

I have the same problem. I hope somebody can help us. get_embedding, cosine_similarity… they just don’t work.

I suspect they deleted embeddings_utils from the library…

I was able to find the module in version 0.28 of openai (the current version is 1.1, I think). I downloaded the module and will simply incorporate it into my code, but it is odd that OpenAI thinks distance calculations are not needed for sorting and selecting the most relevant content vectors. Strange.

I solved it with the following:

from scipy.spatial.distance import cosine

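Then (I store my embeddings in a pandas DataFrame):
df["distances"] = df["embeddings"].apply(lambda x: cosine(q_embeddings, x))

Any other distance or similarity measure of your preference would work too. Note that scipy's cosine returns the cosine distance (1 minus the cosine similarity), so smaller values mean a closer match.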
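
For anyone who was calling cosine_similarity rather than a distance, here is a minimal NumPy sketch of how it could be written by hand. It is my own version and not necessarily identical to what the old module shipped; the cosine_distance helper is a name I made up to show how it relates to the scipy call above.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def cosine_distance(a, b) -> float:
    """Hypothetical helper: 1 minus the similarity, so smaller means closer."""
    return 1.0 - cosine_similarity(a, b)
```

With similarity you sort descending to get the best matches first; with distance you sort ascending.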

Thanks. I’m running into this as well and it’s screwing me over, as I’m not as dialed in as a coder as most users of these modules. We only encountered the problem because my co-worker upgraded the OpenAI modules on the computer and found we can no longer call cosine_similarity or get_embedding. At this point I am not upgrading until I understand this breakage. Can you give me a sense of how you “downloaded the module from version 0.28” but still ensured you could call it in Python? I’m a quick study, but a bit more of a novice using Python. Thanks!

You can find the old version of the file on GitHub:

Place embeddings_utils.py into your project.

Change the import to be:
from embeddings_utils import *

One problem remained: the module itself makes calls to openai, so I had to update those functions to use the new client = OpenAI() syntax.

I deleted all the functions except the 4 I needed.
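
To give an idea of what updating one of those functions might look like, here is a rough sketch of get_embedding written against the 1.x client. Passing the client in as an argument and the default model name are my own assumptions, not what the original file does, so adjust both to your setup.

```python
from typing import List


def get_embedding(client, text: str, model: str = "text-embedding-ada-002") -> List[float]:
    """Return the embedding vector for one piece of text.

    `client` is expected to be an openai.OpenAI() instance (openai 1.x).
    The default model name is an assumption; use whichever model you embed with.
    """
    # Collapse newlines into spaces before embedding the text
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding


# Usage (needs the openai 1.x package and an API key in OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# q_embeddings = get_embedding(client, "some query text")
```

Taking the client as a parameter keeps the function importable without an API key and makes it easy to swap in a stub when testing.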

Thank you!
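Another option, for example:

# Get the distance between the query embedding and each stored embedding
from scipy.spatial.distance import cdist

# cdist returns a 1x1 matrix here, so take its single entry.
# Minkowski with the default p=2 is the Euclidean distance.
df["distances"] = df["embeddings"].apply(
    lambda x: cdist([q_embeddings], [x], metric="minkowski")[0][0])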

You can use this code snippet:

from typing import List
from scipy import spatial

def distances_from_embeddings(
    query_embedding: List[float],
    embeddings: List[List[float]],
    distance_metric="cosine",
) -> List[float]:
    """Return the distance between a query embedding and each embedding in a list."""
    # Map each metric name to the matching scipy distance function
    distance_metrics = {
        "cosine": spatial.distance.cosine,
        "L1": spatial.distance.cityblock,
        "L2": spatial.distance.euclidean,
        "Linf": spatial.distance.chebyshev,
    }
    # One distance per stored embedding, in the same order as the input list
    distances = [
        distance_metrics[distance_metric](query_embedding, embedding)
        for embedding in embeddings
    ]
    return distances
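
The original question also asked about indices_of_nearest_neighbors_from_distances. A minimal sketch, assuming all it needs to do is return the positions of the distances sorted from nearest to farthest, could use NumPy's argsort. The sample distances are made up for illustration.

```python
from typing import List

import numpy as np


def indices_of_nearest_neighbors_from_distances(distances: List[float]) -> np.ndarray:
    """Return the indices that sort the distances from smallest to largest."""
    return np.argsort(distances)


# Made-up distances for four stored embeddings
distances = [0.42, 0.07, 0.91, 0.15]
order = indices_of_nearest_neighbors_from_distances(distances)
print(order)  # [1 3 0 2]
```

The first index is the closest match, so order[:k] gives the k nearest neighbors.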