I'm trying to embed text data from a CSV file.
df['Description'] = df['Description'].fillna('')
title = df['Title']
des = df['Description']
author = df['Author']
def get_embedding(text: str, model="text-embedding-ada-002") -> list[float]:
    return client.embeddings.create(input=[text], model=model)["data"][0]["embedding"]
embedding = get_embedding(des, model="text-embedding-ada-002")
TypeError: Object of type Series is not JSON serializable
Please help with this issue.
I tried to fix it with ChatGPT but could not solve it.
Diet
January 17, 2024, 3:52am
Hi! Welcome to the forums!
des has to be a JSON array of strings; a pandas Series is a more complex object. Have you tried turning it into a list first?
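A minimal sketch of that suggestion, assuming the openai>=1.0 Python client, where the response is an object read through attributes rather than subscripted with ["data"]. The helper name embed_texts and the client argument are hypothetical, not from the original post.

```python
from typing import Iterable


def embed_texts(client, texts: Iterable[str],
                model: str = "text-embedding-ada-002") -> list[list[float]]:
    """Embed several strings in one request.

    `client` is assumed to be an openai.OpenAI() instance (openai>=1.0).
    """
    # A pandas Series is not JSON serializable, so turn it into a plain
    # Python list of strings before handing it to the client.
    inputs = texts.tolist() if hasattr(texts, "tolist") else list(texts)
    response = client.embeddings.create(input=inputs, model=model)
    # One embedding vector per input string.
    return [item.embedding for item in response.data]


# Hypothetical usage, assuming `df` is the DataFrame from the question:
# from openai import OpenAI
# client = OpenAI()
# embeddings = embed_texts(client, df["Description"])
```

Sending the whole column in one request can run into input size limits, so this only suits small inputs.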
Yes, I tried tolist(), json.dumps, and pickle.dumps.
Oh, it works after calling tolist().
But now I need to reduce the number of tokens…
Thank you so much.
By any chance, do you know how to split the input and embed it over several requests?
Diet
January 17, 2024, 4:38am
namdarine:
Thank you so much
np
Yeah! Instead of an array, you can just pass a single string to be embedded. Put that in a loop! Don’t forget to add exponential backoff (double the wait time between retries with every subsequent failure) so you don’t hit the rate limiter too often.
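The loop with exponential backoff could look roughly like the sketch below. It assumes the openai>=1.0 Python client; the function name embed_in_batches and its parameters (batch_size, max_retries, initial_wait, sleep) are hypothetical. It sends small batches rather than strictly one string per call; setting batch_size=1 gives the one-string-per-request loop described above.

```python
import time


def embed_in_batches(client, texts, model="text-embedding-ada-002",
                     batch_size=100, max_retries=5, initial_wait=1.0,
                     sleep=time.sleep):
    """Embed `texts` over several requests, retrying with exponential backoff.

    `client` is assumed to be an openai.OpenAI() instance (openai>=1.0).
    """
    texts = list(texts)  # also turns a pandas Series into a plain list
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        wait = initial_wait
        for attempt in range(max_retries):
            try:
                response = client.embeddings.create(input=batch, model=model)
                break
            except Exception:
                # Assumption: any failure is retried. In real code, catch
                # only rate-limit errors (e.g. openai.RateLimitError).
                if attempt == max_retries - 1:
                    raise
                sleep(wait)
                wait *= 2  # double the wait after every failed attempt
        embeddings.extend(item.embedding for item in response.data)
    return embeddings
```

The wait time resets for each batch, so one slow batch does not penalize the ones after it.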