Inconsistent embedding result with same input

Hi!

I called the same thing twice and got different results. Is this expected?

openai.Embedding.create(input=["Why don't scientists trust atoms? Because they make up everything!"], engine="text-embedding-ada-002").data[0]['embedding'][:5]

[-0.0006829392514191568,
0.008169378153979778,
-0.01032500620931387,
-0.030606037005782127,
-0.010454473085701466]

openai.Embedding.create(input=["Why don't scientists trust atoms? Because they make up everything!"], engine="text-embedding-ada-002").data[0]['embedding'][:5]

[-0.0007094526081345975,
0.00830783974379301,
-0.010237486101686954,
-0.03069303371012211,
-0.010490024462342262]

Have you tried computing the cosine similarity of the two to see how much of a difference there actually is?
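
Something like this would do it (a minimal sketch assuming numpy is available; emb_1 and emb_2 are hypothetical names for the two full embedding lists returned by the calls above):

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity = dot product divided by the product of the norms.
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# similarity = cosine_similarity(emb_1, emb_2)
# print(similarity)  # expect a value very close to 1.0 if the vectors barely differ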

I have just tried it; the similarity is close to ~0.9999, so that part is fine.
I am just coming at this from a systems engineering perspective: it means you cannot use the embedding value to map back to the original text.

Embeddings aren’t invertible in my experience. But you can hash the original text to get an index into a database.
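
A minimal sketch of that idea, assuming you just need a stable key for each text (the dict here is a hypothetical stand-in for a real database):

import hashlib

def text_key(text: str) -> str:
    # Stable, deterministic key derived from the original text.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

store = {}  # hypothetical in-memory "database"
text = "Why don't scientists trust atoms? Because they make up everything!"
# embedding = openai.Embedding.create(input=[text], engine="text-embedding-ada-002").data[0]["embedding"]
# store[text_key(text)] = {"text": text, "embedding": embedding}
# Later, given the same text, text_key(text) retrieves the stored record.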
