Embedding model type error

namdarine · January 17, 2024, 3:43am

I'm trying to embed text data from a CSV file.
df['Description'] = df['Description'].fillna('')

title = df['Title']
des = df['Description']
author = df['Author']

def get_embedding(text: str, model="text-embedding-ada-002") -> list[float]:
    return client.embeddings.create(input=[text], model=model)["data"][0]["embedding"]

embedding = get_embedding(des, model="text-embedding-ada-002")

TypeError: Object of type Series is not JSON serializable

Please help with this issue.
I tried to fix it with ChatGPT but could not solve it.

Diet · January 17, 2024, 3:52am

Hi! Welcome to the forums!

des has to be a JSON array of strings; a pandas Series (a DataFrame column) is a more complex object. Have you tried turning it into a list first?
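A minimal sketch of that list conversion, assuming the v1 `openai` Python client. The helper names `to_text_list` and `embed_texts` are made up for illustration and are not part of the library; `client` is assumed to be an `openai.OpenAI()` instance created elsewhere.

```python
from typing import Any, Iterable


def to_text_list(values: Iterable[Any]) -> list[str]:
    """Turn any iterable (for example a pandas Series) into a plain list of strings."""
    # A plain list of str is JSON serializable; a Series object is not.
    return [str(value) for value in values]


def embed_texts(
    client: Any,
    texts: Iterable[Any],
    model: str = "text-embedding-ada-002",
) -> list[list[float]]:
    """Embed several texts in one request (client is assumed to be openai.OpenAI())."""
    response = client.embeddings.create(input=to_text_list(texts), model=model)
    # The v1 client returns an object, so read attributes instead of ["data"].
    return [item.embedding for item in response.data]
```

With the variables from the question, the call would look something like `embeddings = embed_texts(client, des)`. Calling `des.tolist()` directly should have the same effect as `to_text_list(des)` when every value is already a string.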

namdarine · January 17, 2024, 4:09am

Yes, I tried tolist(), json.dumps, and pickle.dumps.

Diet · January 17, 2024, 4:24am

This works for me:

namdarine · January 17, 2024, 4:36am

Oh, it works after calling tolist().
But now I need to reduce the number of tokens…

Thank you so much.

By any chance, do you know how to split the embedding into several requests?

Diet · January 17, 2024, 4:38am

No problem!

Yeah! Instead of an array, you can just pass the single string to be embedded. Put that in a loop! Don't forget to add exponential backoff (double the wait time between retries after each failure) so you don't hit the rate limiter too often.
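A rough sketch of such a loop with exponential backoff, using only the standard library. `embed_with_backoff`, `embed_one`, `max_retries`, `initial_delay`, and `sleep` are all hypothetical names chosen for this example; `embed_one` stands in for whatever function sends a single string to the embeddings endpoint.

```python
import time
from typing import Callable, Iterable, Sequence


def embed_with_backoff(
    texts: Iterable[str],
    embed_one: Callable[[str], Sequence[float]],
    max_retries: int = 5,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[Sequence[float]]:
    """Embed texts one at a time, doubling the wait after each failed attempt."""
    results = []
    for text in texts:
        delay = initial_delay
        for attempt in range(max_retries + 1):
            try:
                results.append(embed_one(text))
                break
            except Exception:
                # Assumption: any exception is retried here for simplicity.
                # In real use, catch only rate-limit errors such as
                # openai.RateLimitError.
                if attempt == max_retries:
                    raise  # out of retries, let the caller see the error
                sleep(delay)
                delay *= 2  # exponential backoff: 1s, 2s, 4s, ...
    return results
```

Assuming a v1 `openai` client named `client`, one way to use it would be `embed_with_backoff(des.tolist(), lambda t: client.embeddings.create(input=t, model="text-embedding-ada-002").data[0].embedding)`. The `sleep` parameter is injectable only so the waiting can be replaced in tests.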
