How to reproduce benchmark results for the new embedding v3 models? ada vs. v3 small & large models

OpenAI mentioned that the large model's performance stays roughly the same even after reducing dimensions, so I wanted to test this on benchmarks.

By default, the length of the embedding vector will be 1536 for text-embedding-3-small or 3072 for text-embedding-3-large. You can reduce the dimensions of the embedding by passing in the dimensions parameter without the embedding losing its concept-representing properties.
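For v3 models, OpenAI documents that a shortened embedding behaves like the full vector truncated to the first `dimensions` components and then re-normalized to unit length. A minimal sketch of that truncate-and-renormalize step (the function names `normalize` and `shorten_embedding` are mine, and the input vector is made-up illustration data):

```python
import math

def normalize(vec):
    # scale a vector to unit L2 norm
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def shorten_embedding(vec, dim):
    # keep the first `dim` components, then re-normalize
    return normalize(vec[:dim])

# toy "embedding" standing in for a real 1536/3072-dim vector
full = normalize([0.3, -0.1, 0.8, 0.4, -0.2, 0.5])
short = shorten_embedding(full, 3)
```

The shortened vector still has unit norm, which is why cosine similarities remain meaningful after reduction.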

I tried this:

import tiktoken
from mteb import MTEB
from sentence_transformers import SentenceTransformer

embedding_model = "text-embedding-3-small"
embedding_encoding = "cl100k_base"
max_tokens = 8000  # the maximum for text-embedding-3-small is 8191

encoding_model = tiktoken.get_encoding(embedding_encoding)

evaluation = MTEB(tasks=["CQADupstackPhysicsRetrieval"])
results = evaluation.run(encoding_model, output_folder=f"results_openai/{encoding_model}")

I'm getting this error:

    - CQADupstackPhysicsRetrieval, beir, s2p

38316/38316 [00:00<00:00, 55555.36it/s]
ERROR:mteb.evaluation.MTEB:Error while evaluating CQADupstackPhysicsRetrieval: Encoding.encode() got an unexpected keyword argument 'batch_size'
TypeError                                 Traceback (most recent call last)
<ipython-input-78-a01182201c3e> in <cell line: 13>()
     12 evaluation = MTEB(tasks=["CQADupstackPhysicsRetrieval"])
---> 13 results = evaluation.run(encoding_model, output_folder=f"results_openai/{encoding_model}")

5 frames
/usr/local/lib/python3.10/dist-packages/mteb/abstasks/ in encode_queries(self, queries, batch_size, **kwargs)
    116                     "Queries will not be truncated. This could lead to memory issues. In that case please lower the batch_size."
    117                 )
--> 118         return self.model.encode(queries, batch_size=batch_size, **kwargs)
    120     def encode_corpus(self, corpus: List[Dict[str, str]], batch_size: int, **kwargs):

TypeError: Encoding.encode() got an unexpected keyword argument 'batch_size'
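From the traceback, MTEB calls `self.model.encode(queries, batch_size=batch_size, **kwargs)`, so it expects the object passed to `evaluation.run(...)` to expose that interface. A tiktoken `Encoding` is only a tokenizer; its `encode()` takes no `batch_size` argument, hence the `TypeError`. A minimal sketch of a wrapper with the expected interface (the class name `OpenAIEmbedder` and the `embed_fn` callable are my own; in practice `embed_fn` would call the OpenAI embeddings endpoint for a batch of texts):

```python
class OpenAIEmbedder:
    """Wrapper exposing the encode(sentences, batch_size=..., **kwargs) interface MTEB expects."""

    def __init__(self, embed_fn):
        # embed_fn: callable mapping a list of strings to a list of embedding vectors
        self.embed_fn = embed_fn

    def encode(self, sentences, batch_size=32, **kwargs):
        # embed the sentences in batches and return one vector per sentence
        embeddings = []
        for i in range(0, len(sentences), batch_size):
            embeddings.extend(self.embed_fn(sentences[i:i + batch_size]))
        return embeddings

# demo with a stub embedding function (no API calls)
def fake_embed(batch):
    return [[float(len(text))] for text in batch]

model = OpenAIEmbedder(fake_embed)
vectors = model.encode(["a", "bb", "ccc"], batch_size=2)
```

With a real embedding function plugged in, `evaluation.run(model, ...)` should get past this error, since `model.encode` now accepts `batch_size`.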