- Suppose I run GPT-3.5 (or GPT-4) on two different pieces of text (prompts), and a given token appears in both. Will that token be converted into the same embedding in the two executions?
- Suppose I run GPT-3.5 (or GPT-4) on the same piece of text (prompt) twice with temperature = 1. Will the token's embedding be the same across the two executions?
- How about ada-002? Will ada-002 always generate the same embedding for the same input, even with temperature = 1?
The embedding vector values of the modern ada embeddings model vary between identical calls. You can make two, ten, or hundreds of calls, as I've done in analysis, and see that about 4 in 5 results differ (a few occasionally match exactly).
It is likely based on the same architecture as GPT-3.5, which also shows significant variety in its logit values, as now exposed by the new instruct model.
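If you want to reproduce this kind of analysis yourself, here is a minimal sketch of the comparison step. It assumes you have already collected several embedding vectors for the same input (e.g., by calling the embeddings endpoint repeatedly); the vectors below are made-up stand-ins, not real ada-002 output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def compare_runs(embeddings):
    """Compare repeated embedding calls against the first one.

    Returns how many later runs match the first exactly, and the
    minimum cosine similarity across all later runs.
    """
    reference = embeddings[0]
    exact_matches = sum(1 for e in embeddings[1:] if e == reference)
    min_sim = min(cosine_similarity(reference, e) for e in embeddings[1:])
    return exact_matches, min_sim

# Hypothetical vectors standing in for repeated calls on one input:
runs = [
    [0.1234, -0.5678, 0.9012],
    [0.1234, -0.5678, 0.9012],  # occasionally an exact match
    [0.1235, -0.5677, 0.9013],  # more often, tiny differences
]
exact_matches, min_sim = compare_runs(runs)
print(exact_matches, round(min_sim, 6))
```

In practice the differences between runs are tiny (cosine similarity very close to 1), so they rarely matter for retrieval, but an exact-equality check will usually fail.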