Embeddings for the same content vary. How can this be fixed?

We are currently using the text-embedding-3-small model to embed our documents. However, we have noticed that each time we create an embedding for the same document without changing any of the content, the resulting embedding varies.

This inconsistency is affecting our process of identifying the nearest embedding, as it results in different embeddings being picked each time.

By how much do the embeddings differ? What is the dot product of the different embedding vectors for the identical text content?
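For anyone who wants to measure this, here is a minimal sketch of the comparison in plain Python. The two vectors are made-up stand-ins for two API responses to the same text (real text-embedding-3-small vectors are much longer), and the helper names `dot` and `cosine_similarity` are just illustrative. OpenAI's documentation states that its embeddings are normalized to length 1, in which case the dot product and the cosine similarity should give the same number; the cosine version below does not rely on that.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)

# Hypothetical values standing in for two embeddings of the same document.
run_1 = [0.0123, -0.0456, 0.0789, 0.0234]
run_2 = [0.0124, -0.0455, 0.0788, 0.0235]

print("exact match:", run_1 == run_2)  # False: the raw numbers differ
print("cosine similarity:", cosine_similarity(run_1, run_2))  # very close to 1.0
```

If the similarity between two runs on the same text comes out very close to 1.0, the variation is most likely small numerical noise rather than a meaningful change in the embedding.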

If you’re checking for an exact match of one embedding vector to another, that might be giving you a false sense that the embedding changed.

Like @elmstedt said, the only way to compare embeddings is to check the actual vector similarity, not the byte values or the individual numbers of the vector itself. Even if the raw values look different, the vectors can still point to essentially the same location in semantic space.
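To make that concrete, here is a rough sketch of a nearest-embedding lookup that ranks by cosine similarity instead of checking for exact equality. Everything in it is hypothetical: the three-dimensional vectors, the document names, and the `nearest` helper are illustrations only, not anything taken from the original poster's setup.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm

def nearest(query, candidates):
    """Return (name, similarity) for the candidate most similar to query."""
    scored = ((name, cosine_similarity(query, vec)) for name, vec in candidates.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical stored document embeddings.
candidates = {
    "doc_a": [0.90, 0.10, 0.05],
    "doc_b": [0.10, 0.95, 0.10],
    "doc_c": [0.05, 0.10, 0.99],
}

# Hypothetical embeddings of the same query text from two separate calls.
query_run_1 = [0.8801, 0.1502, 0.0499]
query_run_2 = [0.8799, 0.1501, 0.0503]

best_1 = nearest(query_run_1, candidates)
best_2 = nearest(query_run_2, candidates)
print(best_1[0], best_2[0])  # both runs pick the same document
```

With noise this small, the top result only changes when two candidates are almost tied in similarity to the query, so if the nearest match keeps flipping it is worth printing the similarity scores of the top few candidates to see how close they really are.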