Is it possible to achieve embeddings cosine similarity approaching -1?

I made a post a while ago here:

This makes your embeddings more isotropic (more spread out), which pushes the lowest cosine similarities closer to -1. But it requires post-processing a large batch of previously computed embeddings.
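
Here is a minimal sketch of this kind of post-processing, assuming it means centering on the mean of a reference batch and projecting out the top principal direction (the exact recipe may differ). The function names and the synthetic "biased" embeddings are made up for illustration; nothing here comes from a specific embedding model.

```python
import numpy as np

def fit_centering(batch, n_components=1):
    """Estimate the mean and top principal directions from a reference batch."""
    mu = batch.mean(axis=0)
    centered = batch - mu
    # Rows of vt are the principal directions of the centered batch.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mu, vt[:n_components]

def make_isotropic(vectors, mu, directions):
    """Subtract the batch mean, project out the given directions, re-normalize."""
    x = vectors - mu
    x = x - (x @ directions.T) @ directions
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def cosine_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T

# Synthetic stand-in for biased embeddings: a large shared offset plus noise,
# so every vector points in roughly the same direction.
rng = np.random.default_rng(0)
raw = 5.0 + rng.normal(size=(200, 16))

mu, directions = fit_centering(raw)
fixed = make_isotropic(raw, mu, directions)

raw_min = cosine_matrix(raw).min()      # stays well above 0 for this data
fixed_min = cosine_matrix(fixed).min()  # drops below 0 after post-processing
print(raw_min, fixed_min)
```

The catch is the one already mentioned: `mu` and `directions` have to be estimated from a large batch of existing embeddings, and the same values must then be applied to every new vector you compare.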

But the what, how, and why of the bias in the embeddings gets into potential biases in the hidden layers that create the embedding vectors.

So ultimately, the most practical solution is to adjust your thresholds for each embedding model you encounter, as they all seem pretty different and do not conform to normal vector-geometry expectations.
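
One way to set a per-model threshold, as a rough sketch: score a handful of pairs you know are related and a handful you know are unrelated with that model, then put the cutoff between the two groups. The `calibrate_threshold` helper, the percentile rule, and the scores below are all hypothetical; they only illustrate the idea.

```python
import numpy as np

def calibrate_threshold(related_scores, unrelated_scores, tail=5.0):
    """Midpoint between the upper tail of unrelated scores and the lower tail of related ones."""
    hi_unrelated = np.percentile(unrelated_scores, 100.0 - tail)
    lo_related = np.percentile(related_scores, tail)
    return float((hi_unrelated + lo_related) / 2.0)

# Made-up cosine similarities from one model. Note that even the
# "unrelated" pairs score high, which is why a generic cutoff fails.
related = np.array([0.91, 0.88, 0.93, 0.90, 0.87])
unrelated = np.array([0.72, 0.70, 0.75, 0.69, 0.74])

threshold = calibrate_threshold(related, unrelated)
print(threshold)
```

You would repeat this for each model, since a cutoff that works for one will usually be wrong for another.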
