Are embeddings tied to a particular model?

I’m experimenting with embeddings and storing them in a vector DB. My thought is that when I ask a model for the embedding for a given input, that embedding is associated with that model and cannot be reliably used with a different model. Is that true? If so, then if I store embeddings generated with model A, do I need to update or regenerate them if I switch to model A’ or B? BTW, I asked ChatGPT this question and it said I would need to regenerate my embeddings, but thought I would double-check with actual humans! Thanks.

2 Likes

Certainly with the current state of the art, embeddings created by one model cannot be mixed with embeddings from another. It's an active area of research, so this may change, but right now… no.

1 Like

Embeddings by nature depend on the encoder that created them. Different models use different architectures and encoding techniques, so a vector produced by one model is meaningless when interpreted in another model's embedding space — compare them and you get garbage.
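To make that concrete, here's a toy simulation (not real embedding models — just two random linear maps standing in for independently trained encoders). Even when the output dimensions match exactly, the two "models" place the same input in unrelated directions, so cross-model cosine similarity tells you nothing:

```python
import numpy as np

rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(1)

DIM_IN, DIM_OUT = 64, 512

# Two toy "encoders": random linear projections. Real encoders are
# nonlinear transformers; this only illustrates that independently
# trained maps embed the same input into unrelated directions.
model_a = rng_a.normal(size=(DIM_OUT, DIM_IN))
model_b = rng_b.normal(size=(DIM_OUT, DIM_IN))

def embed(model: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Project the input and L2-normalize, so dot product = cosine sim."""
    v = model @ x
    return v / np.linalg.norm(v)

x = rng_a.normal(size=DIM_IN)             # the same "input text"
y = x + 0.1 * rng_a.normal(size=DIM_IN)   # a slightly perturbed input

same_model_sim = float(embed(model_a, x) @ embed(model_a, y))
cross_model_sim = float(embed(model_a, x) @ embed(model_b, x))

print(same_model_sim)   # high: within one model, similar inputs stay close
print(cross_model_sim)  # near zero: across models, even the SAME input diverges
```

Within one model, nearby inputs yield nearby vectors; across models, the geometry is simply not shared.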

2 Likes

Consider that, at the most basic level, different models produce embedding vectors of different dimensionality:

Ada: 1,024 dimensions
Babbage: 2,048 dimensions
Curie: 4,096 dimensions
Davinci: 12,288 dimensions
Ada v2 (text-embedding-ada-002): 1,536 dimensions

So if the question is whether you can mix and match calls across models and still get any kind of meaningful calculation, the answer is no.

2 Likes