Embeddings performance difference between small vs large at 1536 dimensions?

Yes, and to expand on my contribution to that topic…

Here’s another thought:

If the output dimensions were reordered so that those most useful for semantic discrimination came first (letting the vector tolerate truncation), with that ordering learned through extensive trials and then sorted, how would that be done?

By targeting against a benchmark.
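For anyone following along, the mechanics of the truncation itself are simple: keep a prefix of the vector and renormalize it to unit length. The function name below is mine, but the operation matches what the `dimensions` parameter is described as doing for the text-embedding-3 models. A minimal sketch in plain Python:

```python
import math

def truncate_and_renormalize(vec, dims):
    """Keep the first `dims` components, then rescale to unit length.

    This mirrors the documented behavior of the `dimensions` parameter
    on the text-embedding-3 models: prefix truncation followed by
    L2 renormalization."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy 4-d unit vector, truncated to 2 dims.
v = [0.5, 0.5, 0.5, 0.5]
print(truncate_and_renormalize(v, 2))  # → [0.7071..., 0.7071...]
```

The point of the reordering question above is that this only works well if the information the prefix carries was deliberately front-loaded during training.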

One might postulate, then, that in building a truncatable embeddings model, benchmarks such as MTEB were used to discover which dimensions have the highest applicability to known tasks.

Thus, the reduced embeddings you get via the dimensions parameter may perform better against benchmarks than they do in general or novel use.

The challenge, then, is coming up with "unseen" cases to qualify the different 1536-dimension outputs available from all the API models, and to find out whether half of 3-large takes a larger hit than a single benchmark metric shows. In other words: find out how poor the discarded second half really is…
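One way to run that experiment: embed your "unseen" query/document pairs once at full width, then score retrieval using only a slice of each vector, so the first 1536 dimensions and the last 1536 dimensions of 3-large can be compared directly. Everything here is a hypothetical harness of my own (function names and toy data are assumptions, not anything from the API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def slice_and_renorm(vec, start, stop):
    """Take the [start:stop) slice of an embedding and rescale to unit length."""
    part = vec[start:stop]
    norm = math.sqrt(sum(x * x for x in part)) or 1.0
    return [x / norm for x in part]

def rank_accuracy(queries, docs, gold, start, stop):
    """Fraction of queries whose gold document ranks first,
    scoring with only the [start:stop) slice of each embedding."""
    hits = 0
    for qi, q in enumerate(queries):
        qs = slice_and_renorm(q, start, stop)
        scores = [cosine(qs, slice_and_renorm(d, start, stop)) for d in docs]
        if max(range(len(docs)), key=scores.__getitem__) == gold[qi]:
            hits += 1
    return hits / len(queries)

# Toy check with 4-d "embeddings": dims 0-1 stand in for the first half,
# dims 2-3 for the second half. Real use: slices (0, 1536) and (1536, 3072)
# over 3072-d vectors from 3-large.
queries = [[1.0, 0.0, 0.0, 1.0]]
docs = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
print(rank_accuracy(queries, docs, [0], 0, 2))  # → 1.0 (first-half slice)
print(rank_accuracy(queries, docs, [0], 2, 4))  # → 1.0 (second-half slice)
```

The gap between `rank_accuracy(..., 0, 1536)` and `rank_accuracy(..., 1536, 3072)` on real held-out data would be exactly the "how poor is the second half" number a single leaderboard score doesn't show.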