Search vs Similarity

So I’ve got a bunch of projects, and I’ve got a bunch of users (or I would, were this not all hypothetical). I want to suggest users for projects based on the project description, and the user’s skills & bio.

Is it better to use search embeddings or similarity embeddings?

Is there any difference in comparing the embeddings when doing similarity vs search? I’ve successfully implemented search using the doc/query embeddings. Does similarity work more or less the same way?

The way I understand it, similarity search is meant for smaller chunks of text. That said, Google’s sentence encoder and Pinecone are supposed to be better, cheaper, and faster.

Hi Mike, thanks for the question.

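Search embeddings are generally best for matching very short pieces of text to long pieces of text, such as few-word queries to project descriptions. Based on what you said, you are comparing project descriptions against users’ skills and bios, which are both fuller pieces of text, so I suspect similarity embeddings would work best.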
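On the question of how the comparison works: here is a minimal sketch of ranking users for a project by cosine similarity, assuming you already have one embedding per project description and one per user profile from the same similarity model. The function names and the tiny 3-dimensional vectors are hypothetical and stand in for real embeddings.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

def suggest_users(project_embedding, user_embeddings, user_names, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scores = cosine_similarity_matrix([project_embedding], user_embeddings)[0]
    order = np.argsort(-scores)[:top_k]
    return [(user_names[i], float(scores[i])) for i in order]

# Toy vectors for illustration only (assumption: real ones come from your embedding model).
project = [0.9, 0.1, 0.0]
users = [
    [0.8, 0.2, 0.1],  # alice: points in nearly the same direction as the project
    [0.0, 1.0, 0.0],  # bob: mostly unrelated
    [0.0, 0.0, 1.0],  # carol: orthogonal to the project
]
names = ["alice", "bob", "carol"]

suggestions = suggest_users(project, users, names, top_k=2)
print(suggestions)
```

The scoring step is the same as in a doc/query search setup; the main difference in this sketch is that both sides are embedded the same way rather than with separate document and query embeddings.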

To improve your embeddings further, you can customize them by learning a translation matrix. You need at least a hundred examples of what you consider similar (or dissimilar); in your case, for example, users who have previously worked on a project. This is done in this notebook: