In short: text-embedding-ada-002 does not work as well for search as the previous models, but it is more capable than they are at similarity tasks.
Similarity search and semantic search are not the same thing. The older models (those with separate query and document variants) have their own use cases. They also know more about different products, events, and terms. I have no idea why OpenAI wants to remove these models. To me, language is complex, and I need different models to deal with that complexity.
The results of evaluations may speak well to similarities, but in real-world applications, they might not work as expected. For instance, when people make searches, they typically query the subject matter of a document, not the entire document.
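To make the search case concrete, here is a minimal sketch of ranking documents against a short query by cosine similarity. The vectors are made-up toy values, not real model output, and the names `cosine_similarity` and `rank_documents` are hypothetical helpers for illustration. In practice the query and document vectors would come from whichever embedding model is being evaluated.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (assumption: stand-ins for real document vectors).
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.2],
    "doc_c": [0.0, 0.2, 0.9],
}

# Toy embedding of a short query about doc_a's subject matter.
query_vec = [0.8, 0.3, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The point of the sketch is only that search compares a short query vector against much longer documents, so a model can score well on document-to-document similarity and still rank poorly for queries like this.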
The text-embedding-ada-002 model works well for clustering and similarity tasks, such as paraphrase detection. In my opinion, it’s incorrect to compare it with the text-search-davinci-query/doc series. OpenAI never asked for feedback on text-embedding-ada-002.