Hey! I’m currently working on a RAG system using OpenAI’s text-embedding-ada-002 model. Initially, it provided excellent answers by retrieving the right preprocessed chunks when users asked questions. However, after migrating the embedding model to OpenAI’s text-embedding-3-large at 1536 dimensions, my RAG system didn’t perform as well as before. Any insights or suggestions would be greatly appreciated!
“Migrating” here means re-embedding your entire search corpus with the new model at a specific choice of dimensions. Embeddings from different models (or at different dimensions) are not backwards-compatible, so old ada-002 vectors cannot be compared against new 3-large query vectors.
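A minimal re-embedding sketch, assuming your chunks are already plain strings; the batch size and the storage step are placeholders for your own pipeline:

```python
from openai import OpenAI

client = OpenAI()

def reembed(chunks: list[str], model: str = "text-embedding-3-large") -> list[list[float]]:
    """Re-embed every chunk with the new model; old vectors are discarded."""
    vectors: list[list[float]] = []
    # The embeddings endpoint accepts batched input; 100 per request is an
    # arbitrary batch size for this sketch, not a documented limit.
    for i in range(0, len(chunks), 100):
        resp = client.embeddings.create(model=model, input=chunks[i : i + 100])
        vectors.extend(item.embedding for item in resp.data)
    return vectors
```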
text-embedding-3-large can have its output reduced to 1536 dimensions by truncation via the `dimensions` API parameter, but its native dimensionality is 3072. Since you are paying either way, you might as well use the full quality until you run into RAM or computation-time constraints.
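For reference, the `dimensions` parameter is passed per request. This sketch shows both the native 3072-dimensional output and a truncated 1536-dimensional one (the truncated vector is re-normalized by the API, as I understand the docs):

```python
from openai import OpenAI

client = OpenAI()

full = client.embeddings.create(
    model="text-embedding-3-large",
    input="example chunk",
    dimensions=3072,  # native size; omitting the parameter also returns 3072
)
short = client.embeddings.create(
    model="text-embedding-3-large",
    input="example chunk",
    dimensions=1536,  # truncated (and re-normalized) to match ada-002's size
)
print(len(full.data[0].embedding), len(short.data[0].embedding))  # 3072 1536
```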
Thresholds will also need to be recalibrated. With ada-002, a cosine similarity above 0.80 might have been a good cutoff; 3-large scores run lower overall, so something around 0.50 is a more appropriate starting point for the dot products your comparisons return. Tune the exact value against your own query/chunk pairs.
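Something like the following, where the 0.50 cutoff is only an illustrative starting point, not a documented constant:

```python
import numpy as np

THRESHOLD = 0.50  # was ~0.80 with text-embedding-ada-002; tune empirically

def cosine_similarity(a: list[float], b: list[float]) -> float:
    a_arr, b_arr = np.asarray(a), np.asarray(b)
    # OpenAI embeddings are unit-length, so the dot product alone would do;
    # the explicit norms keep this correct for arbitrary vectors.
    return float(a_arr @ b_arr / (np.linalg.norm(a_arr) * np.linalg.norm(b_arr)))

def is_relevant(query_vec: list[float], chunk_vec: list[float]) -> bool:
    return cosine_similarity(query_vec, chunk_vec) >= THRESHOLD
```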