OpenAI Announces Text and Code Embeddings in the OpenAI API

OpenAI just announced the Text and Code Embeddings endpoint:

We are introducing embeddings, a new endpoint in the OpenAI API that makes it easy to perform natural language and code tasks like semantic search, clustering, topic modeling, and classification. Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts. Our embeddings outperform top models in 3 standard benchmarks, including a 20% relative improvement in code search.

https://openai.com/blog/introducing-text-and-code-embeddings/
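To make "relationships between those concepts" concrete, here is a minimal sketch of how similarity between embeddings is commonly measured, using cosine similarity. The three-dimensional vectors and the `toy_embeddings` and `most_similar` names are made up for illustration. Real embeddings from the endpoint have far more dimensions, and nothing below calls the OpenAI API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written vectors standing in for real embeddings.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def most_similar(query, embeddings):
    """Rank every other key by cosine similarity to the query key."""
    q = embeddings[query]
    scores = [
        (name, cosine_similarity(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranked = most_similar("cat", toy_embeddings)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, "kitten" ranks well above "invoice" for the query "cat". Semantic search applies the same idea to a query embedding and a collection of document embeddings.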


Just to tease OpenAI. An interesting thread regarding the efficacy of embeddings:


Hi @m-a.schenk, can you point me to OpenAI's response? I couldn't find it on LinkedIn, and I'm not on Twitter... thx


Thank you. Arvind took the high road - nice. Nils' claims were just too silly. Not that GPT-3 is perfect or the best - I certainly don't know - but I get annoyed when I see things like "1 million times more expensive." That's why I don't make time for Twitter.

@lmccallum his criticisms are actually quite valid and well grounded, especially wrt cost trade-offs. The nuance here is that the standard benchmarks he mentions may not correlate well with performance on real-world datasets, which is what the OpenAI embeddings seem to be optimized for. Regardless, every practitioner should have their own in-house benchmarks for making these judgments.
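As a rough illustration of what an in-house benchmark could look like, here is a minimal sketch that scores any search function by mean recall@k on a small labeled query set. Everything in it is hypothetical: the `recall_at_k` and `evaluate` helpers, the document ids, and the canned results that stand in for a real embedding-backed search.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant ids that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def evaluate(search_fn, labeled_queries, k=3):
    """Mean recall@k of search_fn over (query, relevant_ids) pairs."""
    scores = [
        recall_at_k(search_fn(query), relevant, k)
        for query, relevant in labeled_queries
    ]
    return sum(scores) / len(scores)


# Hypothetical canned rankings standing in for a real search backend.
canned_results = {
    "reset password": ["doc2", "doc7", "doc1"],
    "refund policy": ["doc4", "doc9", "doc3"],
}

# Hypothetical relevance labels drawn from your own data.
labeled_queries = [
    ("reset password", {"doc1", "doc2"}),
    ("refund policy", {"doc3", "doc5"}),
]


def toy_search(query):
    """Return the canned ranking for a query."""
    return canned_results[query]


print(evaluate(toy_search, labeled_queries, k=2))
print(evaluate(toy_search, labeled_queries, k=3))
```

Swapping `toy_search` for functions backed by different embedding models, and timing or pricing each run, gives a like-for-like comparison of quality and cost on your own queries.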
