We’ve got an AI chatbot built using OpenAI, and we’re currently using text-embedding-ada-002 as our embeddings model. A couple of days ago a much better embeddings model was released. The reason I was particularly interested is that, among other things, it can reduce the number of dimensions from 1,536 to as few as 512.
For us, reducing dimensions would be very valuable, since we’re running our own SQLite-based database adapter with a vector similarity search (VSS) plugin built on FAISS. Reducing the number of dimensions from 1,536 to 512 would significantly shrink the memory footprint, allowing us to handle much more data without adding more RAM.
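Some back-of-the-envelope math on why this matters, assuming float32 vectors (4 bytes per dimension), which is what a flat FAISS index stores; index and SQLite overhead come on top of this:

```python
# Rough memory footprint of a flat (exact) vector index:
# one float32 (4 bytes) per dimension per record.

def index_bytes(records: int, dims: int, bytes_per_float: int = 4) -> int:
    return records * dims * bytes_per_float

# ada-002 produces 1,536-dim vectors; the new models can be shortened to 512.
print(index_bytes(10_000, 1536) / 1024**2)  # ~58.6 MB
print(index_bytes(10_000, 512) / 1024**2)   # ~19.5 MB
```

So at 10,000 records the raw vector data alone drops by roughly a factor of three.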
However, is it really better, or at least equally good?
I would love to hear from somebody having practical experience with this …
Thank you for the great answer, although it didn’t really answer my question. I see this part: “So my preliminary conclusion is that when you can afford it performance-wise, using all the available dimensions is still very favorable”, which I guess concludes no.
Our industry is AI chatbots for customer support and sales navigation/suggestions. Most of our clients have 500 or fewer records; a few have up to 10,000 or 20,000, but those are the edge cases. We still need to support them, though.
We typically extract multiple records as we create the context we send to OpenAI. I’d love an answer along the lines of “No, don’t change” or “Yes, change, the quality is still (almost) the same”.
You’re running into RAM issues with 500 records? That’s like 10 megabytes, give or take, with 2,000 dims.
Nope, but we’re running into RAM issues once we start seeing 10,000+ records. We’re deploying into Kubernetes, and we’re trying to be as conservative as possible with resources, so the default deployment uses 400 MB of RAM. I’d love to be able to use that amount of RAM for 30,000+ records, but this is not possible now …
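To make the 400 MB budget concrete, here is a quick capacity estimate, assuming float32 vectors and (hypothetically) reserving half the pod for the application and SQLite itself:

```python
# How many records fit in a given RAM budget for the raw vectors alone,
# assuming float32 (4 bytes per dimension). The 200 MB split is an
# illustrative assumption, not a measured figure.

def max_records(budget_bytes: int, dims: int, bytes_per_float: int = 4) -> int:
    return budget_bytes // (dims * bytes_per_float)

budget = 200 * 1024**2  # 200 MB of the 400 MB pod reserved for vectors
print(max_records(budget, 1536))  # ~34,000 records at 1,536 dims
print(max_records(budget, 512))   # ~102,000 records at 512 dims
```

Under those assumptions the 30,000+ record target only fits comfortably at the reduced dimension count.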
Without training and/or smaller embeddings, this is not possible, unfortunately. If the new 512-dimension embeddings give us a 1 to 3 percent quality loss, I wouldn’t mind that much. But if it fundamentally changes the lookups into our DB for the worse, that would be a big no-no …
Note that we’re using SQLite, so the application itself shares RAM with the database …
Single deployment/single POD, single process deployment, containing “everything” …