Some questions about text-embedding-ada-002’s embedding

curt.kennedy · January 27, 2023, 2:51pm

OK, I coded up the algorithm and I’d say I got good results in preliminary testing. I now get cosine similarities that are positive, negative and zero across the embedding search space. The results seem to make sense too!

The only weird thing is that my max and min cosine similarities are ±0.1 instead of ±1. I am only using the top 15 dimensions (roughly D/100 for ada-002, where D = 1536). Maybe this is reducing the energy somehow? Anyway, the relative correlations and anti-correlations seem to make sense. Plus, my top correlations seem better than the original ones.
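
One possible explanation for the shrunken range, offered as a guess rather than a diagnosis: if the reduced vectors are compared with a plain dot product and not re-normalized, the result is bounded by the product of their (now much smaller) norms. The sketch below uses synthetic random unit vectors as stand-ins for real ada-002 embeddings, and a hypothetical reduction that simply keeps `K = 15` coordinates; neither is taken from the paper's algorithm. It only illustrates that dividing by the reduced norms restores the full ±1 range.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 1536, 15  # ada-002 dimensionality, and the ~D/100 dimensions kept

# Synthetic stand-ins for embeddings: random unit-length vectors.
# These are NOT real ada-002 outputs; they only illustrate the scaling effect.
X = rng.standard_normal((200, D))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Hypothetical reduction (an assumption): keep K coordinates.
# After this step the vectors are no longer unit length.
Z = X[:, :K]

# Plain dot product, as one would use for unit-length embeddings.
raw_dot = Z @ Z.T

# Cosine similarity proper: divide by the norms of the reduced vectors.
norms = np.linalg.norm(Z, axis=1, keepdims=True)
cos = raw_dot / (norms * norms.T)

# Compare the largest magnitudes over distinct pairs only.
off_diag = ~np.eye(len(Z), dtype=bool)
max_abs_dot = float(np.abs(raw_dot[off_diag]).max())
max_abs_cos = float(np.abs(cos[off_diag]).max())

print(f"max |dot| without re-normalizing: {max_abs_dot:.4f}")
print(f"max |cos| after re-normalizing:   {max_abs_cos:.4f}")
```

If the similarities are already divided by the reduced norms, this is not the cause and the small range comes from somewhere else.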

Will have to test more to see, but so far the algorithm in the paper seems to work!
