Embeddings for the same content vary. How can this be fixed?

We are currently using the text-embedding-3-small model to embed our documents. However, we have noticed that each time we create an embedding for the same document without changing any of the content, the resulting embedding varies.

This inconsistency is affecting our process of identifying the nearest embedding, as it results in different embeddings being picked each time.

By how much is it affecting the embeddings? What is the dot product of the different embedding vectors for the identical text content?

If you’re checking for an exact match of one embedding vector to another, that might be giving you a false sense that the embedding has meaningfully changed.

Like @anon22939549 said, the only way to compare embeddings is to check the actual vector similarity, not the byte values or raw numbers of the vector itself. Even if the vectors look radically different, they can point to essentially the same location in semantic space.
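To make the point concrete, here is a minimal sketch of the right comparison. The two vectors below are made-up illustrative values, not real model output; OpenAI embeddings are unit-normalized, so for them the cosine similarity reduces to a plain dot product.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two vectors; for unit-normalized
    vectors (as OpenAI embeddings are) this equals the dot product."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Two illustrative "embeddings" of the same text: the raw floats
# differ slightly between calls, but the directions nearly coincide.
v1 = [0.12, -0.34, 0.56, 0.78]
v2 = [0.1201, -0.3398, 0.5601, 0.7799]

print(v1 == v2)                              # False: exact match fails
print(round(cosine_similarity(v1, v2), 4))   # 1.0: same point in semantic space
```

An exact equality check on the floats will almost always report a difference, while the similarity shows the vectors are, for retrieval purposes, the same.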

Hi @wclayf , I encountered the same issue. Would you mind explaining more about why we can’t expect the same text content to get exactly the same numbers in the embedding vector? Thank you so much!

Without intending to take away from @wclayf ’s potential answer, here is a good take on the subject by @curt.kennedy

You can further read up on potential workarounds that will also improve the workflow.

Hope this helps!

That non-deterministic behavior and rank flipping is now expected in the returned vectors.

Thing is, though: there is no “best” in “nearest” when it comes to AI-powered semantic similarity. You’ll probably discover another model among dozens or hundreds whose results a human would judge better (though finding a human who knows an entire embedded corpus well enough to judge is also a bit hard).

A typical approach is to use a specialist “reranker” on a top-k (or top-budget) initial exhaustive search result, perhaps filtering to 10x more candidates than the final chunk count, string length, or input tokenization budget allows. Then you can spend the money on populating upgraded full-dimension embeddings on demand, extending the vectors with additional model calls for an averaging effect, or using varied models with different learning. You could even ask a large-context AI to pick 10 of 50 indexes against a query, if such models weren’t so biased by context input position.
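The two-stage idea above can be sketched in a few lines. This is a toy illustration, not production code: `noisy_embed` is a hypothetical stand-in for an embedding API call (a fixed per-text direction plus small per-call noise, mimicking the model's non-determinism), and the "reranker" here is simply a re-score with more expensive averaged embeddings.

```python
import math
import random

DIM = 16  # toy dimensionality; real models use hundreds or thousands

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def noisy_embed(text, seed):
    """Hypothetical stand-in for an embedding call: a deterministic
    base direction per text, plus small per-call Gaussian noise."""
    base_rng = random.Random(text)  # seeding Random with a str is deterministic
    base = [base_rng.uniform(-1, 1) for _ in range(DIM)]
    noise_rng = random.Random(seed)
    return normalize([x + noise_rng.gauss(0, 0.01) for x in base])

def averaged_embedding(text, calls=5):
    """Average several embedding calls and renormalize: the per-call
    noise partially cancels, giving a more stable vector."""
    vecs = [noisy_embed(text, seed=i) for i in range(calls)]
    mean = [sum(col) / calls for col in zip(*vecs)]
    return normalize(mean)

def search(query, corpus, k=3, oversample=10):
    """Two-stage retrieval: a cheap exhaustive first pass keeping
    far more candidates than needed, then a costlier re-score of
    only those survivors."""
    q = averaged_embedding(query)
    coarse = sorted(corpus, key=lambda doc: dot(q, noisy_embed(doc, seed=0)),
                    reverse=True)[: k * oversample]
    return sorted(coarse, key=lambda doc: dot(q, averaged_embedding(doc)),
                  reverse=True)[:k]
```

The design point is that the expensive step (here, five calls per text) only ever runs on the small oversampled candidate set, not the whole corpus, which is what makes averaging or full-dimension upgrades affordable.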