Why has the software market not filled this vacuum?

We have a vibrant free market in software in general. And there is particularly a vibrant market for new AI software.

Given that, why is there a near absence of out-of-the-box embedding software?

I have seen a few SaaS consultancies offering embedding services, apparently at enterprise prices that are quoted only during a sales call. Why doesn’t someone offer either local or cloud-based embedding software at reasonable pay-as-you-go prices, similar to AI model API offerings?


There is open source code which will perform embeddings, in case you did not know.

You might check it out if you are interested in this topic. I think there are also “in the cloud” tutorials out there on the net.

Google: word2vec

Yes, I realize that. Open source is a great concept, though it almost always has a steeper hill to climb, which is why, for most software genres, there are both paid and open-source offerings.

Why aren’t there paid offerings (other than at the enterprise level) for something that seemingly would be in huge demand currently?

Well, I personally do not think that embeddings are far superior to traditional search methods in the “biggest” search use case, which (in my view only) is full-text DB search; so maybe others do not see the “huge demand” that you mention?

Full-text DB searches are faster than using vectorized methods; and full-text DB searches are well established.

It seems to me we are simply at a high point in a hype cycle, where many people are enamored with vectorized search because of OpenAI’s current popularity and hype. And of course these embeddings are “not free”, so why would you pay good money to vectorize entries in a DB and perform vectorized searches, etc., when you get very good performance with full-text DB searches (and that is free)?

What am I missing here, @rkaplan?

Do you think vectorized searches are far superior to full-text DB searches? Have you confirmed this in development?

I am a newbie in this regard so by all means correct me if I am wrong. But my understanding is that vectorized searches are accepted as the best way to do a semantic search. Can that be done in a full-text search without lots of additional manual semantic tagging?

Vectorized searches are great for NLP searches, no doubt.

Basic text substring matching and full-text DB searches power almost all of the websites on the Internet. Is the Internet “falling apart” or “broken” from a lack of good search tools?

Simple question. Yes? No?

I think you would agree that the answer is “No”.

So, what is the “amazing use case” that every major DB vendor out there has missed over the past few decades? Vector-based searches are not actually “super cutting edge, SOTA”.

Take a look at this simple vectorized search for “Hello World” using the OpenAI API in this post here in our community:

The stated rationale/benefit for vector searches is that it allows a search for similarity, not just a search for an exact match.
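To make "search for similarity" a bit more concrete, here is a minimal sketch of how a vector search ranks entries by cosine similarity. The three-dimensional vectors and the names `docs`, `query`, and `rank` are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", invented for this example only.
docs = {
    "greeting": [0.9, 0.1, 0.0],
    "farewell": [0.1, 0.9, 0.0],
    "database": [0.0, 0.1, 0.9],
}

# Hypothetical query vector; in practice this would be the embedding
# of the user's search text.
query = [0.8, 0.2, 0.1]


def rank(query_vec, doc_vecs):
    """Return document names ordered from most to least similar."""
    return sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )


ranked = rank(query, docs)
print(ranked)
```

Note that nothing here requires the query to share any literal words with an entry; the ranking depends only on how close the vectors are.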

While I will not claim to be an expert, the performance of Google vs. GPT-3 certainly is consistent with the theory. Whatever theoretical reason you ascribe to their different performance characteristics, the capability introduced by GPT-3 is profound.

You are mistaken about full text DB searching. These searches are not exact matches as you said above.

Full-text search refers to searching some text inside extensive text data stored electronically and returning results that contain some or all of the words from the query. In contrast, traditional search would return exact matches.
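As a rough illustration of the distinction in that definition, here is a toy sketch comparing an exact match with a word-overlap match. The function names and sample documents are hypothetical, and real full-text engines do far more (inverted indexes, stemming, relevance ranking); this only shows the "some or all of the words" idea.

```python
def exact_match(query, docs):
    """Return only documents identical to the query string."""
    return [d for d in docs if d == query]


def any_word_match(query, docs):
    """Return documents sharing at least one word with the query,
    ordered by how many query words they contain."""
    words = set(query.lower().split())
    scored = []
    for d in docs:
        overlap = len(words & set(d.lower().split()))
        if overlap:
            scored.append((overlap, d))
    # Stable sort: ties keep their original document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]


# Hypothetical sample data, invented for this example.
docs = [
    "hello world",
    "hello there",
    "goodbye world and hello again",
    "unrelated text",
]

exact = exact_match("hello world", docs)
loose = any_word_match("hello world", docs)
print(exact)
print(loose)
```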


Anyway, I have provided a perspective on why embeddings have not taken the DB search-world by storm. Of course YMMV!
