Missing reference when importing from previous python library's "utils"

vsraman85 · November 27, 2023, 8:52pm

ModuleNotFoundError: No module named 'openai.embeddings_utils'
With the latest version of the OpenAI library, the import statement below throws the error above.
from openai.embeddings_utils import get_embedding, cosine_similarity

_j · November 27, 2023, 10:24pm

The python library has been completely revamped.

The kludges that were available in “utils” are mostly no longer necessary, such as a function for outputting a generator as a dict.

Looks like it’s time to adapt the API reference into a more useful example…

from openai import OpenAI
client = OpenAI()

def multiple_embed(embed_input):  # takes a list of strings (may hold a single string)
    try:
        embed = client.embeddings.create(
            model="text-embedding-ada-002",
            input=embed_input, encoding_format="float"
        )
    except Exception as e:
        print(f"Embeddings failure {e}")
        raise

    # collect one vector per input, in the order the API returned them
    veclist = [item.embedding for item in embed.data]
    if len(veclist) != len(embed_input):
        # a bare `raise` has no active exception here, so raise explicitly
        raise ValueError("number of embeddings doesn't match number of inputs")

    cost = embed.usage.total_tokens
    print(f"embeddings: {cost} tokens, {len(veclist)} vectors")
    return veclist


texts = [" cute kitten", " ugly cat"]  # 8192 max tokens total
vectors_list = multiple_embed(texts)  # a list, with vector list for each input

vsraman85 · November 27, 2023, 10:32pm

Thanks for the response. I was able to embed the text and save it as a CSV file.
When a query comes in, I would like to match it against the stored vectors by cosine similarity. I have been using the OpenAI documentation for this POC, where I found the code sample below.

from openai.embeddings_utils import get_embedding, cosine_similarity

def search_reviews(df, product_description, n=3, pprint=True):
   embedding = get_embedding(product_description, model='text-embedding-ada-002')
   df['similarities'] = df.ada_embedding.apply(lambda x: cosine_similarity(x, embedding))
   res = df.sort_values('similarities', ascending=False).head(n)
   return res

res = search_reviews(df, 'delicious beans', n=3)

_j · November 27, 2023, 10:50pm

You can grab the utils as they were, and just adapt some of the dependencies.

github.com

openai/openai-python/blob/release-v0.28.1/openai/embeddings_utils.py

import textwrap as tr
from typing import List, Optional

import matplotlib.pyplot as plt
import plotly.express as px
from scipy import spatial
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import average_precision_score, precision_recall_curve
from tenacity import retry, stop_after_attempt, wait_random_exponential

import openai
from openai.datalib.numpy_helper import numpy as np
from openai.datalib.pandas_helper import pandas as pd


@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def get_embedding(text: str, engine="text-similarity-davinci-001", **kwargs) -> List[float]:

    # replace newlines, which can negatively affect performance.

This file has been truncated.

For example:

import numpy as np
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
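
The truncated `get_embedding` helper can be rewritten against the new client in the same spirit. The following is a minimal sketch, not the library's own code: it assumes the v1 `openai` package, takes the client as a parameter (an assumption made here so the function has no hidden global), and leaves out the `tenacity` retry decorator used in the 0.28.1 file.

```python
from typing import Any, List


def get_embedding(client: Any, text: str,
                  model: str = "text-embedding-ada-002") -> List[float]:
    """Return the embedding vector for one string.

    `client` is expected to be an `openai.OpenAI()` instance (v1 library).
    """
    # The old helper replaced newlines before embedding; keep that behaviour.
    text = text.replace("\n", " ")
    response = client.embeddings.create(model=model, input=[text])
    # One input string yields exactly one item in `response.data`.
    return response.data[0].embedding
```

Typical use would be `client = OpenAI()` followed by `get_embedding(client, 'delicious beans')`; the returned list can be passed straight to the `cosine_similarity` function above.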

vsraman85 · November 29, 2023, 7:04pm

Thank you, but still, the highlighted libraries are throwing errors. I referred to the link that was provided in your above response.

github.com

openai/openai-python/blob/release-v0.28.1/openai/embeddings_utils.py

vsraman85 · November 29, 2023, 7:10pm

I tried it in an Anaconda notebook as well and I am still getting errors. It seems some of the libraries are deprecated or were removed in the newer version?

_j · November 29, 2023, 7:26pm

You need code-writing abilities here: explore the code to see which methods are being used and which imports they rely on, and then write a similar, portable version of the function for your own code.

vsraman85 · November 29, 2023, 10:34pm

Got it. I am now able to build a file with only the needed functions. But following the OpenAI documentation, I am trying to implement the text search as shown below, and I am getting this error:
AttributeError: 'DataFrame' object has no attribute 'ada_embedding'
The reference link is
https://platform.openai.com/docs/guides/embeddings/use-cases

df = pd.read_csv('filename.csv')

df['ada_embedding'] = df.ada_embedding.apply(eval).apply(np.array)

res = search_reviews(df, 'Sample Text', n=3)

_j · November 30, 2023, 2:49am

You’ll likely find the answer in: openai.datalib.pandas_helper

Another thing you can try is rather brute-force. Download the whole repo at that branch. Then, run a search-replace on “openai” and replace it with “myopenai” in every directory name, file name, and file, so you can put that “myopenai” directory in your python application’s directory and again import it.

Replace in files will replace a lot of actual API calls with broken calls, but that will tell you where you are still relying on old 0.28.1 methods that must be rewritten.

Better is to just learn what the utility function was doing and write a new one.
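
As a sketch of what such a rewrite might look like: the `AttributeError` above most likely means the saved CSV has no column named exactly `ada_embedding` (in pandas, `print(df.columns)` shows what was actually written). The example below uses only the standard library, so it does not depend on the removed `openai.datalib` helpers. The names `load_rows` and `search_rows` are hypothetical, and the query embedding is assumed to have been computed already.

```python
import ast
import csv
import io
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def load_rows(csv_text: str, embedding_column: str = "ada_embedding") -> List[Dict]:
    """Parse CSV text and turn the stored embedding strings back into lists."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    if embedding_column not in columns:
        # Fail early with the real column names instead of an AttributeError.
        raise KeyError(f"column {embedding_column!r} not found; have {columns}")
    rows = []
    for row in reader:
        # literal_eval parses "[0.1, 0.2]" without executing arbitrary code.
        row[embedding_column] = ast.literal_eval(row[embedding_column])
        rows.append(row)
    return rows


def search_rows(rows: List[Dict], query_embedding: Sequence[float], n: int = 3,
                embedding_column: str = "ada_embedding") -> List[Tuple[float, Dict]]:
    """Return the n rows most similar to the query as (score, row) pairs."""
    scored = [(cosine_similarity(row[embedding_column], query_embedding), row)
              for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]
```

With a file on disk, the CSV text can be read with `open('filename.csv').read()` and the query vector obtained from the embeddings endpoint with the same model that produced the stored vectors.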
