Dear Gents,
I am trying to follow the OpenAI tutorial “https://platform.openai.com/docs/tutorials/web-qa-embeddings/building-a-question-answer-system-with-your-embeddings”; I made some changes so that it uses the latest OpenAI methods.
I successfully ran it on a small website, and it gave very good answers.
But when I used it on a large website, the following Python cell ran for more than 500 minutes without finishing:
################################################################################
# Step 10
################################################################################
# Note that you may run into rate limit issues depending on how many files you try to embed
# Please check out our rate limit guide to learn more on how to handle
# this: https://platform.openai.com/docs/guides/rate-limits
df['embeddings'] = df.text.apply(
    # lambda x: openai.Embedding.create(input=x, engine='text-embedding-ada-002')['data'][0]['embedding'])
    lambda x: create_embedding(x, api_key="my api key oooooooooooooooooo"))
df.to_csv('processed/' + local_domain + '/embeddings.csv')
df.head()
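For context, the cell above makes one API request per row of the data frame. A minimal sketch of sending several rows per request instead is shown here. It assumes the embeddings endpoint accepts a list of inputs; `embed_in_batches`, `embed_batch`, and `fake_embed_batch` are hypothetical names introduced only for illustration, and the fake function stands in for a real API call so the sketch runs offline.

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 100,
) -> List[List[float]]:
    """Embed texts in chunks of batch_size, preserving the input order."""
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        chunk = list(texts[start:start + batch_size])
        # One call per chunk instead of one call per row.
        vectors = embed_batch(chunk)
        if len(vectors) != len(chunk):
            raise ValueError("embed_batch returned a different number of vectors than inputs")
        embeddings.extend(vectors)
    return embeddings


# Hypothetical stand-in for a real embedding call: one tiny vector per text.
def fake_embed_batch(chunk: List[str]) -> List[List[float]]:
    return [[float(len(text))] for text in chunk]


result = embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
print(result)
```

In the real cell, `embed_batch` would wrap whatever function calls the API with a list of texts, and the result could be assigned with `df['embeddings'] = embed_in_batches(df.text.tolist(), ...)`.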
The size of the scraped.csv file is 17.393M; is this a huge file for embedding?
The total number of tokens in the data frame is 6,769,846. Is this a huge number of tokens?
My balance shows $5.70 / $10.00; sorry for sharing this information, but I want to avoid any surprises.
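For scale, a back-of-the-envelope cost check for the token count above might look like the sketch below. The price per 1,000 tokens is an assumption, not a figure from the tutorial, so it should be replaced with the current price for the embedding model in use.

```python
# Token count reported for the data frame in this post.
total_tokens = 6_769_846

# ASSUMPTION: price in USD per 1,000 tokens; check the current pricing page.
assumed_price_per_1k_tokens = 0.0001


def estimate_cost(tokens: int, price_per_1k: float) -> float:
    """Return the estimated cost in USD for embedding the given number of tokens."""
    return tokens / 1000 * price_per_1k


estimated_cost = estimate_cost(total_tokens, assumed_price_per_1k_tokens)
print(f"Estimated embedding cost: ${estimated_cost:.2f}")
```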
Any help, please? I badly need to finish this embedding task.
Regards,
Omran