Missing reference when importing from previous python library's "utils"

ModuleNotFoundError: No module named 'openai.embeddings_utils'
With the latest version of the OpenAI library, the import statement below throws the error above.
from openai.embeddings_utils import get_embedding, cosine_similarity

The python library has been completely revamped.

The kludges that were available in “utils” are mostly no longer necessary, such as a function for outputting a generator as a dict.

Looks like it’s time to adapt the API reference into a more useful example…

from openai import OpenAI
client = OpenAI()

def multiple_embed(embed_input):  # expects a list of strings; wrap a single string in a list
    veclist = []
    try:
        embed = client.embeddings.create(
          model="text-embedding-ada-002",
          input=embed_input, encoding_format="float"
        )
    except Exception as e:
        print(f"Embeddings failure {e}")
        raise

    # Collect one vector per input, in the order returned by the API
    for item in embed.data:
        veclist.append(item.embedding)
    if len(veclist) != len(embed_input):
        # a bare `raise` has no active exception here, so raise explicitly
        raise ValueError("number of embeddings doesn't match number of inputs")

    cost = embed.usage.total_tokens
    print(f"embeddings: {cost} tokens, "
          f"{len(embed.model_dump().get('data'))} vectors")
    return veclist


texts = [" cute kitten", " ugly cat"]  # 8192 max tokens total
vectors_list = multiple_embed(texts)  # a list, with vector list for each input

Thanks for the response. I was able to embed the text and save it as a CSV file.
When an input comes in, I would like to match it against the stored vectors using cosine similarity. I have been using the OpenAI documentation for this POC, where I found the code sample below.

from openai.embeddings_utils import get_embedding, cosine_similarity

def search_reviews(df, product_description, n=3, pprint=True):
   embedding = get_embedding(product_description, model='text-embedding-ada-002')
   df['similarities'] = df.ada_embedding.apply(lambda x: cosine_similarity(x, embedding))
   res = df.sort_values('similarities', ascending=False).head(n)
   return res

res = search_reviews(df, 'delicious beans', n=3)

You can grab the utils as they were, and just adapt some of the dependencies.

For example:

import numpy as np
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
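
Building on that function, here is a minimal sketch of the search from the question with no dependency on openai.embeddings_utils. It assumes the query embedding has already been computed and is passed in, and that the vectors live in a column named ada_embedding as in the question. The name search_by_embedding and the toy two-dimensional vectors are made up for illustration.

```python
import numpy as np
import pandas as pd


def cosine_similarity(a, b):
    # Same formula as the old utility: dot product over the product of norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def search_by_embedding(df, query_embedding, n=3, column="ada_embedding"):
    # Hypothetical helper: rank rows by similarity to an already-computed query vector
    scored = df.copy()  # leave the caller's DataFrame untouched
    scored["similarities"] = scored[column].apply(
        lambda vec: cosine_similarity(vec, query_embedding)
    )
    return scored.sort_values("similarities", ascending=False).head(n)


# Toy data standing in for real 1536-dimensional embeddings
reviews = pd.DataFrame({
    "text": ["beans", "coffee", "soap"],
    "ada_embedding": [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]],
})
top = search_by_embedding(reviews, [1.0, 0.1], n=2)
print(top[["text", "similarities"]])
```

Passing the query vector in keeps the ranking logic separate from whichever client call produces the embedding, so it can be checked without an API key.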

Thank you, but still, the highlighted libraries are throwing errors. I referred to the link that was provided in your above response.

I tried it in an Anaconda notebook as well and am getting the errors below. It seems some of the libraries are deprecated or removed in the newer version?

You need code-writing abilities and would need to explore the code to see what methods are being used and what imports they rely on, and then write a similar, portable version of the function for your code.

Got it. I am now able to build a file with only the needed functions. But when I try to implement the text search as described in the OpenAI documentation, as shown below, I get this error:
AttributeError: 'DataFrame' object has no attribute 'ada_embedding'
The documentation link is
https://platform.openai.com/docs/guides/embeddings/use-cases

df = pd.read_csv('filename.csv')

df['ada_embedding'] = df.ada_embedding.apply(eval).apply(np.array)

res = search_reviews(df, 'Sample Text', n=3)

You’ll likely find the answer in: openai.datalib.pandas_helper
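
Separately from the library internals, that AttributeError is what pandas raises when the DataFrame has no column named ada_embedding, so it may be worth checking which column names the CSV actually contains before parsing. A minimal sketch, assuming the embeddings were saved as stringified lists; load_embeddings and the sample CSV text are hypothetical, and ast.literal_eval is swapped in for eval so that only Python literals are parsed.

```python
import ast
import io

import numpy as np
import pandas as pd


def load_embeddings(csv_source, column="ada_embedding"):
    # Hypothetical helper: read the CSV and turn stringified lists back into arrays
    df = pd.read_csv(csv_source)
    if column not in df.columns:
        # Fail with a message that lists what the file really contains
        raise KeyError(
            f"column {column!r} not found; available columns: {list(df.columns)}"
        )
    df[column] = df[column].apply(lambda s: np.array(ast.literal_eval(s)))
    return df


# Sample CSV text standing in for 'filename.csv'
csv_text = 'text,ada_embedding\nfoo,"[1.0, 0.0]"\nbar,"[0.0, 1.0]"\n'
df = load_embeddings(io.StringIO(csv_text))
print(df.columns.tolist())
```

If the column was saved under another name (for example "embedding"), passing that name as `column` avoids the error.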

Another thing you can try is rather brute-force. Download the whole repo at that branch. Then, run a search-replace on “openai” and replace it with “myopenai” in every directory name, file name, and file, so you can put that “myopenai” directory in your python application’s directory and again import it.

Replace in files will replace a lot of actual API calls with broken calls, but that will tell you where you are still relying on old 0.28.1 methods that must be rewritten.

Better is to just learn what the utility function was doing and write a new one.
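
As an illustration of that approach, the old get_embedding helper essentially replaced newlines with spaces and returned the first vector from the embeddings endpoint. Below is a sketch of an equivalent for the 1.x client. The optional `client` parameter and the stand-in client at the bottom are assumptions added here so the function can be run without an API key; they are not part of the original utility.

```python
from types import SimpleNamespace


def get_embedding(text, model="text-embedding-ada-002", client=None):
    if client is None:
        # Imported lazily so the sketch also runs with a stand-in client
        from openai import OpenAI
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
    text = text.replace("\n", " ")  # the old helper also stripped newlines
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding


class _FakeEmbeddings:
    # Hypothetical stand-in that mimics only the response shape used above
    def create(self, input, model):
        self.last_input = input
        return SimpleNamespace(data=[SimpleNamespace(embedding=[0.1, 0.2, 0.3])])


fake_client = SimpleNamespace(embeddings=_FakeEmbeddings())
vector = get_embedding("delicious\nbeans", client=fake_client)
print(len(vector))
```

In real use, calling get_embedding("delicious beans") without the `client` argument would create an OpenAI client and call the API.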