Inconsistent embedding result with same input

Hi!

I called the same thing twice and got different results. Is this expected?

openai.Embedding.create(input=["Why don't scientists trust atoms? Because they make up everything!"], engine="text-embedding-ada-002").data[0]['embedding'][:5]

[-0.0006829392514191568,
0.008169378153979778,
-0.01032500620931387,
-0.030606037005782127,
-0.010454473085701466]

openai.Embedding.create(input=["Why don't scientists trust atoms? Because they make up everything!"], engine="text-embedding-ada-002").data[0]['embedding'][:5]

[-0.0007094526081345975,
0.00830783974379301,
-0.010237486101686954,
-0.03069303371012211,
-0.010490024462342262]

Have you tried computing the cosine similarity of the two to see how much of a difference there actually is?
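
Something like this would do it (a minimal sketch assuming numpy is available; emb_1 and emb_2 are hypothetical names for the two full embedding lists returned by the calls above):

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity = dot product divided by the product of the norms.
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# similarity = cosine_similarity(emb_1, emb_2)
# print(similarity)  # expect a value very close to 1.0 if the vectors barely differ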

I have just tried it; the similarity is close to ~0.9999, so that part is fine.
I am just coming at this from a systems engineering perspective: it means you cannot use the embedding value to map back to the original text.

Embeddings aren’t invertible in my experience. But you can hash the original text to get an index into a database.
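
A minimal sketch of that idea, assuming you just need a stable key for each text (the dict here is a hypothetical stand-in for a real database):

import hashlib

def text_key(text: str) -> str:
    # Stable, deterministic key derived from the original text.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

store = {}  # hypothetical in-memory "database"
text = "Why don't scientists trust atoms? Because they make up everything!"
# embedding = openai.Embedding.create(input=[text], engine="text-embedding-ada-002").data[0]["embedding"]
# store[text_key(text)] = {"text": text, "embedding": embedding}
# Later, given the same text, text_key(text) retrieves the stored record.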
