Are vectors generated by text-embedding-3-small always the same for the same text input?

suhas.chatekar · May 8, 2024, 12:19pm

I am building a prototype for continuous ingestion of content into a vector database. I am using the text-embedding-3-small model to generate vectors before storing them in an Azure Search index.

The internal API I am using to get newly created content is not perfect, which means I might fetch content that is already stored in the index. I am thinking this should not be a problem: if the model produces the same vector representation, then when I send that vector to the Azure Search index it would simply replace the current vector in the index. So my question is: will the model generate the same vector for the same input text on the second and subsequent calls to the model API?

_j · May 8, 2024, 12:53pm

Embeddings models should not be used to verify identical content or to avoid duplication. You can use a hash algorithm for that, for free.
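As a rough sketch of the hash approach mentioned above: hash each piece of text before calling the embeddings API, and only embed text whose hash has not been seen. The helper names `content_key` and `filter_new` and the in-memory `seen_keys` set are made up for illustration; in a real pipeline the seen keys would presumably live in the index or another store.

```python
import hashlib


def content_key(text: str) -> str:
    """Return a stable SHA-256 hex digest for the given text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_new(texts, seen_keys):
    """Yield (key, text) pairs only for texts not hashed before.

    seen_keys is a set that is updated in place (hypothetical stand-in
    for whatever store tracks already-ingested content).
    """
    for text in texts:
        key = content_key(text)
        if key not in seen_keys:
            seen_keys.add(key)
            yield key, text


seen = set()
batch = ["first document", "second document", "first document"]
to_embed = list(filter_new(batch, seen))
# Only the two distinct texts remain; the repeated one is skipped
# before any embeddings call is paid for.
```

If the same digest is also used as the document key in the index, re-uploading unchanged content should overwrite the existing entry rather than create a second one, assuming the index upserts by key.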

Results are close enough between successive runs that a search will almost always return the same top results, and embeddings quality is somewhat subjective anyway.

suhas.chatekar · May 8, 2024, 1:52pm

Thanks for that response. Very useful.

I am aware of hashing techniques and that I can use them to make sure I am not vectorizing the same content multiple times. I was wondering if there is a way to avoid having to hash the content separately.

_j · May 8, 2024, 3:53pm

The only way I see would be more expensive and slower: do not add the embedding you just paid for if an exhaustive search returns an existing result with similarity above 0.999 and the text returned from the database matches.
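A minimal sketch of that check, assuming cosine similarity as the score and a plain in-memory list of `(text, vector)` pairs standing in for the database. The function names and the list-based index are illustrative only; a real vector store would run the exhaustive search on its side.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(new_text, new_vector, index, threshold=0.999):
    """Exhaustively scan index for a stored entry that scores above
    threshold AND has exactly the same text as the new item."""
    for stored_text, stored_vector in index:
        score = cosine_similarity(new_vector, stored_vector)
        if score > threshold and stored_text == new_text:
            return True
    return False


# Toy two-dimensional vectors, not real embeddings.
index = [("hello", [1.0, 0.0]), ("world", [0.0, 1.0])]
already_stored = is_duplicate("hello", [0.99999, 0.001], index)
```

Note that the embedding has already been paid for by the time this check runs, and the scan touches every stored vector, which is why this route is the more expensive and slower one.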
