We’ve got an AI chatbot built on OpenAI, and we’re currently using text-embedding-ada-002 as our embeddings model. A couple of days ago a much better embeddings model was released. What particularly caught my interest is that, among other things, it can reduce the number of dimensions from 1,536 to 512.
For us, reducing dimensions would be very valuable, since we’re running our own SQLite-based database adapter with VSS (vector similarity search) functionality developed as a plugin on top of FAISS. Cutting the number of dimensions from ~1,500 to ~500 would significantly reduce the memory footprint, allowing us to handle much more data without adding more RAM.
However, is the new model really better, or at least equally good, at the lower dimension count?
I would love to hear from somebody with practical experience here …
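For context, this is roughly what the switch would look like on our side — a minimal sketch, assuming the official `openai` Python SDK, where `dimensions` is the parameter the new text-embedding-3 models accept for server-side shortening. The client object is passed in (in production it would be `openai.OpenAI()`), and the model/dimension values are just the ones we’re considering:

```python
def embed(client, texts, model="text-embedding-3-small", dimensions=512):
    """Return one embedding vector per input text.

    `dimensions` asks the API to shorten the embeddings server-side;
    `client` is injected so the function can be exercised without a
    live API key.
    """
    resp = client.embeddings.create(model=model, input=texts, dimensions=dimensions)
    return [item.embedding for item in resp.data]
```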
It’s unfortunately not super straightforward, but here are some threads tackling this issue:
It looks like ‘text-embedding-3’ embeddings are truncated/scaled versions from higher dim version
New OpenAI Announcement! Updated API Models and no more lazy outputs
It really depends on what you’re embedding, how many documents you have, etc., and even then your mileage will vary.
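To make the “truncated/scaled” point concrete, here’s a rough sketch (pure NumPy, names illustrative) of what shortening an embedding amounts to: keep the first k components and re-normalize to unit length. Whether the shortened vectors are good enough for your corpus is exactly the open question:

```python
import numpy as np

def shorten(embedding, k):
    """Keep the first k components and re-normalize to unit length.

    This is (roughly) what server-side dimension shortening does;
    the direction of the truncated prefix is preserved, only the
    scale changes.
    """
    v = np.asarray(embedding, dtype=np.float32)[:k]
    return v / np.linalg.norm(v)
```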
Thank you for the great answer, although it didn’t quite answer my question. I see this part — “So my preliminary conclusion is that when you can afford it performance-wise, using all the available dimensions is still very favorable” — which I guess concludes “no”.
Our industry is AI chatbots for customer support and sales navigation/suggestions. Most of our clients have 500 or fewer records; a few go up to 10,000 or 20,000, but those are the edge cases. We still need to support them, though.
We typically extract multiple records when building the context we send to OpenAI. I’d love an answer along the lines of either “No, don’t change” or “Yes, change, the quality is still (almost) the same”.
But thx for the answer
You’re running into RAM issues with 500 records? That’s like 10 megabytes, give or take, with 2,000 dims.
Maybe consider loading the index on demand? It should only take a split second.
OK, you didn’t ask for this advice, but:
I’d personally say no — reducing dimensions is generally not worth the headache.
You’re running into RAM issues with 500 records? That’s like 10 megabytes, give or take, with 2,000 dims.
Nope, but we do run into RAM issues once we start seeing 10,000+ records. We’re deploying into Kubernetes, and we’re trying to be as conservative as possible with resources, so the default deployment uses 400 MB of RAM. I’d love to be able to serve 30,000+ records within that amount, but this is not possible today …
Without training, and/or smaller embeddings, this is unfortunately not possible. If the new ~500-dim embeddings model costs us 1 to 3 percent in quality, I wouldn’t mind that much. But if it fundamentally degrades the lookups into our DB, that would be a big no-no …
Note that we’re using SQLite, so the application itself shares RAM with the database …
Single deployment, single pod, single process, containing “everything” …
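For the record, here’s the back-of-envelope arithmetic behind the constraint — a sketch assuming flat float32 vectors at 4 bytes per component, ignoring FAISS index and SQLite overhead:

```python
def index_bytes(records, dims, bytes_per_float=4):
    """Raw size of a flat float32 vector store, overhead ignored."""
    return records * dims * bytes_per_float

# 30,000 records at ada-002's 1536 dims vs. a shortened 512 dims:
full = index_bytes(30_000, 1536)   # 184_320_000 bytes, ~176 MiB
small = index_bytes(30_000, 512)   # 61_440_000 bytes, ~59 MiB
```

So at 1,536 dims the vectors alone already eat roughly half of a 400 MB pod before the app and SQLite get any, while 512 dims would leave comfortable headroom — which is why the quality question matters so much to us.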