Some questions about text-embedding-ada-002’s embeddings

OK, here is the solution: it can basically be fixed by post-processing. Apparently this is a problem for trained embeddings right out of the gate. The technical term is that ada-002’s embedding space isn’t isotropic: one dominant direction takes over the space, effectively reducing ada-002’s usable dimensionality. Post-processing can improve this. The paper shows the improvements are slight (2.3%), but it can be done.
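A common post-processing recipe for anisotropic embeddings (the post doesn’t name one, so this is an assumption) is the “all-but-the-top” style approach: center the embeddings, then project out the top principal components that dominate the space. A minimal sketch with NumPy:

```python
import numpy as np

def remove_dominant_directions(embeddings, k=1):
    """Center embeddings and project out the top-k principal
    components, which tend to dominate anisotropic spaces."""
    centered = embeddings - embeddings.mean(axis=0)
    # Top-k principal directions via SVD of the centered matrix
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:k]                      # shape (k, dim)
    # Subtract each embedding's projection onto those directions
    return centered - centered @ top.T @ top

# Toy demo: synthetic vectors with one artificially dominant direction
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8)) + 5.0 * rng.normal(size=(100, 1))
post = remove_dominant_directions(emb, k=1)
```

The function name and `k` parameter are illustrative, not from any library. After this step the remaining variance is spread more evenly across dimensions, which is what “more isotropic” means in practice; cosine similarities computed on `post` are then less dominated by that one shared direction.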
