Offline Embedding Options

I’m seeking advice from the community on any options they might be aware of for generating embeddings without needing to call a cloud service. This is for Vectra, my local vector DB project, and is related to a question I got from a user. It looks like TensorFlow might be an option, but I’m wondering if there are other options, and if anyone in the community can comment on the quality of the TensorFlow embeddings for semantic search compared to either OpenAI’s or HF’s embeddings?

Offline embeddings are interesting not only as a cost-savings measure but also for search over private data, where you don’t want any data leaking to an external cloud.

I saw this posted a while back:


The issue you will hit, though, is that it requires a reasonable amount of compute. Hosting models is also not trivial: you need monitoring, an API, etc.


Look at this:


Considering the current cost of embeddings, I think the depreciation on the hardware you’d run local ones on would be greater than the cost incurred by using the API.

Agreed. Especially in the cloud, it’s way more expensive to use an EC2 instance with a GPU. :smiling_face_with_tear:

If I am using an embedding approach like this:

what am I doing here? Am I hitting the OpenAI API or not? Because in the code I have not specified any endpoint, just the model.

Can someone please explain?

Yes, you will be hitting the ada endpoint when you embed if you specify text-embedding-ada-002.
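To make the "no endpoint in my code" part concrete: the client library fills the URL in for you from the model name. Here is a rough sketch of what happens under the hood, using only the standard library (the real SDK adds retries, error types, etc.; an `OPENAI_API_KEY` environment variable is assumed to hold your key):

```python
# Sketch of the HTTP request the OpenAI client makes for you. The model
# name is just a field in the request body; the endpoint URL is supplied
# by the library, which is why you never see it in your own code.
import json
import os
import urllib.request

def embed(text: str) -> list:
    req = urllib.request.Request(
        "https://api.openai.com/v1/embeddings",  # the implicit endpoint
        data=json.dumps(
            {"model": "text-embedding-ada-002", "input": text}
        ).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # ada-002 returns a 1536-dimensional float vector
        return json.load(resp)["data"][0]["embedding"]
```

So yes: every call goes over the network to OpenAI; nothing runs locally except tokenization helpers.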

Suppose I have question-answer pairs as text data, and now I want to convert them to embeddings like this:

embedding_model = "text-embedding-ada-002"
embedding_encoding = "cl100k_base"

What am I doing here? If I am using cl100k_base, does that mean I am hitting the ada endpoint to convert the text data into embeddings?

And one other question: can I save the embedded data in SQL Server or not?

And how would the query search work? Because my embedded data has the answers in vector form, but the question is in text.
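On the query-search part of the question above: at query time you embed the incoming question with the same model you used for the answers, then rank the stored vectors by cosine similarity. A minimal sketch of that pattern, using SQLite as a stand-in for SQL Server and tiny made-up 3-d vectors in place of real 1536-d ada-002 embeddings (all names here are illustrative):

```python
# Store each answer's embedding as JSON text in an ordinary SQL column,
# then scan and rank by cosine similarity in application code.
import json
import math
import sqlite3

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (text TEXT, embedding TEXT)")

# Toy 3-d vectors standing in for real embeddings from the same model.
rows = [
    ("Paris is the capital of France.", [0.9, 0.1, 0.0]),
    ("Mitochondria are the powerhouse of the cell.", [0.0, 0.2, 0.9]),
]
for text, vec in rows:
    conn.execute("INSERT INTO answers VALUES (?, ?)", (text, json.dumps(vec)))

def search(question_embedding, top_k=1):
    # Embed the question first (not shown), then compare against every row.
    scored = [
        (cosine_similarity(question_embedding, json.loads(emb)), text)
        for text, emb in conn.execute("SELECT text, embedding FROM answers")
    ]
    return sorted(scored, reverse=True)[:top_k]

# Pretend this is the embedding of "What is the capital of France?"
best = search([0.8, 0.2, 0.1])
```

A full table scan like this is fine for small datasets; past a few hundred thousand rows you would want a vector index (which is what Vectra and dedicated vector DBs provide).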

Try Sentence Transformers.

You might want to start with one of the many pretrained models, e.g. all-MiniLM-L6-v2, which is lightweight (just ~80 MB), fast, and yields good results.
