Which database tools are suitable for storing embeddings generated by the Embedding endpoint?

So a few seconds, 10 seconds? What ballpark is it?

And this is for your 3.5k embeddings, right? Or was it 4.5M embeddings?

This may be of interest:

They do offer some datasets for quick testing:

The link above kinda shows why I haven’t even bothered to measure :wink:

Yes, 3.5K objects containing several text fields, each using embeddings of 1536 dimensions/vector

Wow, some impressive numbers!

The downside I can see, for folks like me with sparse traffic, is that the hosting costs would eat me alive.

The tech is cool though. I have looked into FAISS as an algorithm, but the naive argmax works just fine for me (for now) :rofl:

But will have to keep Weaviate in mind for sure :+1: :100:
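For anyone wondering what the "naive argmax" approach mentioned above might look like, here is a minimal sketch, assuming the embeddings fit in memory as a NumPy array and that cosine similarity is the metric. The function name `top_match` and the random test data are made up for illustration; this is not the poster's actual code.

```python
import numpy as np

def top_match(query, embeddings):
    """Return (index, score) of the stored embedding most similar to query."""
    # Normalize both sides so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    # One matrix-vector product scores every stored embedding at once.
    scores = e @ q
    best = int(np.argmax(scores))
    return best, float(scores[best])

# Hypothetical data: 1,000 random vectors of 1536 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 1536)).astype(np.float32)
# A query that is a slightly noisy copy of row 42.
query = embeddings[42] + 0.01 * rng.normal(size=1536).astype(np.float32)

idx, score = top_match(query, embeddings)
print(idx, score)
```

This scans every vector on each query, so latency grows linearly with the number of embeddings. That is the trade-off against an indexed vector database.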

That’s why I’m using their cloud services…

In my case, with sparse traffic, I can run 400,000 embeddings @ 1-2 seconds latency for less than $1 per month, assuming the system has settled after a cold start and there are no elaborate database backups.

Here my major cost is backups, oddly enough.

High volumes of traffic might drive me to Weaviate. At that point it might be close on cost, but I’m pretty sure I’d have to ditch argmax to get anywhere close to Weaviate on latency! :sweat_smile:

I would have to ditch multiplies and go with the Manhattan metric, and code it efficiently (probably vectorized on the entire batch of embeddings at once). That might give me a fighting chance :joy:
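A rough sketch of what a vectorized Manhattan (L1) search over the whole batch could look like, assuming NumPy and in-memory embeddings. The function name `nearest_manhattan` and the toy data are hypothetical, and note that L1 ranking is not guaranteed to match cosine ranking, so results may differ from the dot-product approach.

```python
import numpy as np

def nearest_manhattan(query, embeddings):
    """Return (index, distance) of the row closest to query under the L1 metric."""
    # Broadcasting subtracts the query from every row in one step;
    # the L1 distance is the sum of absolute differences, with no multiplies.
    dists = np.abs(embeddings - query).sum(axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])

# Toy data for illustration only.
embeddings = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = np.array([1.0, 2.0])

idx, dist = nearest_manhattan(query, embeddings)
print(idx, dist)
```

Here the smallest distance wins, so the lookup uses `argmin` rather than `argmax`.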
