Semantic vs search embedding

OpenAI distinguishes between similarity and search embeddings, saying that similarity embeddings are better suited to assessing whether two texts are similar, while search embeddings are better suited to identifying whether a short text is closely related to a much longer one.

Which OpenAI embedding models specialize in which function? For example, which use case is the text-embedding-ada-002 model meant for?


Semantic embeddings are better for measuring the similarity between two texts, while search embeddings are better for finding long texts that are relevant to a short query.

Example:

  • Semantic embedding: What is the similarity between the sentences: ‘The cat sat on the mat’ and ‘The feline sat on the rug’?
  • Search embedding: Find all documents in the database that are relevant to the query: ‘What is the capital of France?’
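To make the two use cases concrete, here is a minimal sketch in plain Python. The vectors are made-up 3-dimensional toy values, not output from any OpenAI model, and `cosine_similarity` and `rank_documents` are hypothetical helper names chosen for illustration. In practice each vector would come from an embedding model, and cosine similarity is one common (assumed) choice of metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scores = {doc_id: cosine_similarity(query_vec, vec)
              for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy vectors, invented for illustration (not real model output).
cat_mat = [0.9, 0.1, 0.0]      # 'The cat sat on the mat'
feline_rug = [0.8, 0.2, 0.1]   # 'The feline sat on the rug'

# Similarity use case: compare two short texts directly.
similarity = cosine_similarity(cat_mat, feline_rug)

# Search use case: rank longer documents against a short query.
query = [0.1, 0.9, 0.2]        # 'What is the capital of France?'
documents = {
    "paris_article": [0.2, 0.8, 0.3],
    "cat_care_guide": [0.9, 0.1, 0.1],
}
ranking = rank_documents(query, documents)
print(similarity, ranking)
```

Both tasks reduce to the same vector comparison. What differs is how the embeddings were trained and whether you compare one pair of texts or score a whole collection against a query.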

What does text-embedding-ada-002 do well: semantic or search?

It can handle both semantic and search tasks well.