How many human languages does text-embedding-ada-002 support?

SomebodySysop · August 16, 2023, 3:20am

Is there a list somewhere of the human languages supported by text-embedding-ada-002?

In this article, Revolutionizing Natural Language Processing: OpenAI’s ADA-002 Model Takes the Stage | by Jen Codes | Medium, I found:

“It has been trained on a diverse set of languages, including English, Spanish, French, and Chinese, and has shown impressive results in tasks such as cross-lingual transfer learning.”

But I have so far been unable to find an actual list anywhere.

Foxalabs · August 16, 2023, 6:19am

I don’t think there is a definitive list, mainly because that would presuppose a definitive list of the datasets it was trained on. It supports whatever languages are in the training set, which would typically include the ones most commonly used online.

The entire field of NLP is new; we are the ones making the textbooks, the quick guides and the lists. This could be a great side project for someone to do: a linguistic embedding performance evaluation by language.
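
To make that side project a bit more concrete, here is a rough sketch of how a per-language check could look. It is only an illustration, not an established benchmark: the `embed` argument is a placeholder for whatever function returns a vector for a string (for example, a wrapper around the embeddings API), the `score_languages` name and the "margin" metric are my own assumptions, and the sentence pairs would have to be supplied by you. The idea is that an English sentence and its translation should sit closer together than an English sentence and an unrelated translation.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_languages(
    embed: Callable[[str], Vector],
    parallel: Dict[str, List[Tuple[str, str]]],
) -> Dict[str, float]:
    """Return a per-language margin.

    `parallel` maps a language code to (english, translation) pairs.
    The margin is the mean similarity of matched pairs minus the mean
    similarity of mismatched pairs; a higher margin suggests the model
    aligns that language with English more cleanly.
    """
    scores: Dict[str, float] = {}
    for lang, pairs in parallel.items():
        if not pairs:
            continue  # nothing to score for this language
        en_vecs = [embed(en) for en, _ in pairs]
        tr_vecs = [embed(tr) for _, tr in pairs]
        matched = [cosine(e, t) for e, t in zip(en_vecs, tr_vecs)]
        mismatched = [
            cosine(en_vecs[i], tr_vecs[j])
            for i in range(len(pairs))
            for j in range(len(pairs))
            if i != j
        ]
        mean_matched = sum(matched) / len(matched)
        mean_mismatched = sum(mismatched) / len(mismatched) if mismatched else 0.0
        scores[lang] = mean_matched - mean_mismatched
    return scores
```

With a handful of translated sentence pairs per language, comparing the margins would give a first, very rough ranking. A real evaluation would need far more pairs and a proper retrieval metric.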

SomebodySysop · August 16, 2023, 10:40pm

I did find this, so I’m rolling with it for now: List of languages supported by ChatGPT | Botpress Blog

I have tested Spanish, Chinese and Korean and gotten good results. By good, I mean the results you would expect from a machine that translates prompts into English for vectorizing, then interprets the resulting documents and translates them back into the non-English language. That would be tough for an experienced translator, let alone a machine that has little sense of grammar, semantics, phraseology, idioms, etc.

raymonddavey · August 17, 2023, 6:28am

I can add Portuguese, French, Spanish, Italian, Mandarin and German. I have tested all of these and am using them live with embeddings.

fruktoed · February 23, 2024, 6:44am

How about Russian and Ukrainian? Are they supported by the text embedding model?
