Is it possible to write a tool to build vectors (embedding) yourself?

Hello!

We are building a Q&A service that uses embeddings for local search and GPT-3.5 Turbo for answer generation.

We need advice on what tool to use to build vectors without using OpenAI.

Is it possible to write a tool to build vectors (embeddings) yourself?

Also, which is better for search: full-text search or vector (embedding) search?

I do not think you can easily find a “better” embeddings generator than the OpenAI API.

However, there are some “OK” implementations out there like word2vec.

Have you tested it?

:slight_smile:
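To the OP's question of whether you can build vectors yourself: yes, at small scale it is quite doable. Here is a toy sketch (pure Python, hypothetical three-sentence corpus) of the distributional idea behind word2vec — real word2vec trains a shallow neural network, while this simplified version just uses raw co-occurrence counts, but both produce vectors where words that share contexts end up close together:

```python
# Toy word vectors from co-occurrence counts (a simplified stand-in for
# word2vec, which trains a neural network instead of counting directly).
from math import sqrt

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks fell on market news",
]

tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Each word's vector counts its neighbors within a +/-2-word window.
vectors = {w: [0.0] * len(vocab) for w in vocab}
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if i != j:
                vectors[w][index[sent[j]]] += 1.0

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# "cat" and "dog" appear in the same contexts (the/sat/on), so they land
# closer together than "cat" and "stocks".
print(cosine(vectors["cat"], vectors["dog"]) > cosine(vectors["cat"], vectors["stocks"]))
```

In practice you would use a trained library (e.g. gensim's `Word2Vec`) rather than raw counts, but the geometry — similarity as closeness of vectors — is the same.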

Thanks for the tip, we will try implementing it with word2vec.

Word2Vec is one of the better options out there.

There are also versions that handle sentences.

You can write your own transformer-based embedding engine if you really want to (or use pre-trained models) … SBERT, for example:

https://www.sbert.net/index.html

SBERT works more at the sentence level. Also look for BERT-based systems: BERT, RoBERTa, etc.

Is there any reason why you’d like to do this?
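On the OP's other question (full-text vs. vector search): keyword search only matches literal terms, while vector search ranks by geometric closeness, so it also finds paraphrases. A toy sketch in pure Python — the 3-d vectors here are hand-made stand-ins for what an SBERT-style model would actually produce:

```python
# Keyword search vs. vector search on a toy corpus. The vectors are
# hypothetical hand-made embeddings, not real model outputs.
from math import sqrt

docs = {
    "the cat sleeps on the sofa":  [0.9, 0.6, 0.0],
    "a kitten naps on the couch":  [0.9, 0.7, 0.0],  # paraphrase, no shared keywords
    "interest rates rose sharply": [0.0, 0.0, 0.9],
}
query_text = "cat sofa"
query_vec = [0.9, 0.6, 0.1]  # hand-made embedding of the query

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Keyword search: only docs sharing a literal query word are hits,
# so the "kitten naps" paraphrase is missed entirely.
keyword_hits = [d for d in docs if any(w in d.split() for w in query_text.split())]

# Vector search: rank all docs by embedding similarity; the paraphrase
# scores high because its vector is close to the query's.
ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)

print(keyword_hits)
print(ranked)
```

With real embeddings the effect is the same: vector search surfaces semantically related text that full-text search cannot see, which is exactly why it fits a Q&A retrieval pipeline. Many production systems combine both (hybrid search).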

You may like this HF leaderboard of embeddings (includes Ada)


Probably not, but the OP was wondering if you could do it yourself. The answer, of course, is "yes", but once performance is taken into consideration, @RonaldGRuckus, I would say "no". :upside_down_face:
