What do all these models do?

there are a lot of models. what do all these models do?

id
ada
ada:2020-05-03
ada-code-search-code
ada-code-search-text
ada-search-document
ada-search-query
ada-similarity
babbage
babbage:2020-05-03
babbage-code-search-code
babbage-code-search-text
babbage-search-document
babbage-search-query
babbage-similarity
code-cushman-001
code-davinci-001
code-davinci-002
code-davinci-edit-001
code-search-ada-code-001
code-search-ada-text-001
code-search-babbage-code-001
code-search-babbage-text-001
curie
curie:2020-05-03
curie-instruct-beta
curie-search-document
curie-search-query
curie-similarity
curie-similarity-fast
cushman:2020-05-03
davinci
davinci:2020-05-03
davinci-if:3.0.0
davinci-instruct-beta
davinci-instruct-beta:2.0.0
davinci-search-document
davinci-search-query
davinci-similarity
if-curie-v2
if-davinci:3.0.0
if-davinci-v2
text-ada:001
text-ada-001
text-babbage:001
text-babbage-001
text-curie:001
text-curie-001
text-davinci:001
text-davinci-001
text-davinci-002
text-davinci-edit-001
text-davinci-insert-001
text-davinci-insert-002
text-search-ada-doc-001
text-search-ada-query-001
text-search-babbage-doc-001
text-search-babbage-query-001
text-search-curie-doc-001
text-search-curie-query-001
text-search-davinci-doc-001
text-search-davinci-query-001
text-similarity-ada-001
text-similarity-babbage-001
text-similarity-curie-001
text-similarity-davinci-001

3 Likes

Ada, Babbage, Curie, Cushman and Davinci are different models doing the same, but having different size. The model with a higher size tends to give better results, but has a higher price and takes more time to give result (because it requires more computation). Davinci is the model with the biggest size.

The models that have “code” in the name are part of “Codex” - they serve the purpose of generating code.

The models that have “text” in the name serve the purpose of generating plain text (that’s their primary purpose, but they can also generate code to a small extent).

The models that have “search” or “similarity” in name are for “embeddings” - they serve the purpose of finding similar texts (as described in the documentation under “embeddings”).

The models that have “search-code” in the name are for searching with “code”, “search-text” is for searching with text. The models that have “search” in the name, but not “code” are for searching text (“document” is for specyifing the documents among which you search, “query” is for the query by which you search). “similarity” is for finding similar documents as well, but there’s some difference between “search” and “similarity”. From what I’ve remember it’s mostly about the length of the searched documents.

The models with “edit” are for editing code or text (as opposed to completing it).

The models with “instruct” are the models trained specifically for being able to deal with the input (prompt) in a form or instructions.

The models with “insert” are for insertions (you pass [insert] in the prompt and it generates text/code in the middle of the prompt, in place where you put “insert”, instead of generating at the end).

“001”, “002” are different versions. I assume “002” are better than “001” becuase 002 is an improved version of 001.

I don’t know what the models with “if” are.

5 Likes

thank you and excellent! we will try them out as such!

1 Like

Ack. I found this incredibly helpful post after I wrote a page long post asking what the models do. This post should at the top of the Similarity Search / Embeddings doc pages! :scream:

3 Likes

what is the role of each model and price per 1000 token?
“whisper-1” : 0.000,
“babbage” : 0.00,
“text-davinci-003”: 0.00,
“davinci”: 0.00,
“text-davinci-edit-001”: 0.00,
“babbage-code-search-code”: 0.00,
“text-similarity-babbage-001”: 0.00,
“code-davinci-edit-001”: 0.00,
“text-davinci-001”: 0.00,
“gpt-4-0613”: 0.00,
“ada”: 0.00,
“babbage-code-search-text”: 0.00,
“babbage-similarity”: 0.00,
“gpt-4”: 0.00,
“gpt-3.5-turbo-0613”: 0.00,
“gpt-3.5-turbo-16k-0613”: 0.00,
“code-search-babbage-text-001”: 0.00,
“text-curie-001”: 0.00,
“gpt-3.5-turbo”: 0.00,
“gpt-3.5-turbo-16k”: 0.00,
“code-search-babbage-code-001”: 0.00,
“text-ada-001”: 0.00,
“text-similarity-ada-001”: 0.00,
“curie-instruct-beta”: 0.00,
“gpt-3.5-turbo-0301”: 0.00,
“ada-code-search-code”: 0.00,
“ada-similarity”: 0.00,
“code-search-ada-text-001”: 0.00,
“text-search-ada-query-001”: 0.00,
“davinci-search-document”: 0.00,
“ada-code-search-text”: 0.00,
“text-search-ada-doc-001”: 0.00,
“davinci-instruct-beta”: 0.00,
“text-similarity-curie-001”: 0.00,
“code-search-ada-code-001”: 0.00,
“ada-search-query”: 0.00,
“text-search-davinci-query-001”: 0.00,
“curie-search-query”: 0.00,
“davinci-search-query”: 0.00,
“babbage-search-document”: 0.00,
“ada-search-document”: 0.00,
“text-search-curie-query-001”: 0.00,
“gpt-4-0314”: 0.00,
“text-search-babbage-doc-001”: 0.00,
“curie-search-document”: 0.00,
“text-search-curie-doc-001”: 0.00,
“babbage-search-query”: 0.00,
“text-babbage-001”: 0.00,
“text-search-davinci-doc-001”: 0.00,
“text-search-babbage-query-001”: 0.00,
“curie-similarity”: 0.00,
“curie”: 0.00,
“text-embedding-ada-002”: 0.00,
“text-similarity-davinci-001”: 0.00,
“text-davinci-002”: 0.00,
“davinci-similarity”: 0.00,

The models can be sorted by their type and price classification by AI:

ada
ada-code-search-code
ada-similarity
ada-code-search-text
ada-search-query
ada-search-document
code-search-ada-text-001
code-search-ada-code-001
text-search-ada-query-001
text-search-ada-doc-001
text-similarity-ada-001
text-ada-001

babbage
babbage-code-search-code
babbage-similarity
babbage-code-search-text
babbage-search-document
babbage-search-query
text-search-babbage-doc-001
text-babbage-001
text-search-babbage-query-001

curie
curie-instruct-beta
curie-search-query
curie-search-document
curie-similarity
text-similarity-curie-001
text-search-curie-query-001
text-search-curie-doc-001

davinci
davinci-search-document
davinci-instruct-beta
davinci-search-query
text-search-davinci-query-001
text-search-davinci-doc-001
text-similarity-davinci-001
text-davinci-003
text-davinci-edit-001
text-davinci-001
text-davinci-002
davinci-similarity

gpt-4-0314
gpt-4-0613
gpt-4
gpt-3.5-turbo-0613
gpt-3.5-turbo-16k-0613
gpt-3.5-turbo-0301

The size and skill of the GPT-3 model goes from ada->babbage->curie->davinci, and so does the price, with a-c being rather simple models with a low number of parameters and dimensions in comparison to d.

The bare name version is an untuned GPT-3 completion engine, the model that a user can fine-tune, while others are pre-tuned for different tasks, some never leaving “beta”. All of these will be going away in 2024. text-davinci-003 is the only one comparable to ChatGPT.

GPT-3.5 and GPT-4 we know well, they are chat-trained models that use a different API endpoint. The ones with dates are a “snapshot” (that still seems to get modifications anyway). The name without a date is currently pointed to the latest.

text-embedding-ada-002 is a special case. It uses an embedding endpoint and returns vectors. Other specialized embedding models with “similarity” or “search” are deprecated in favor of this one.

Whisper is the speech-to-text AI.

Pricing: Pricing

Thank you for your reply. I was curious to know which are embedding model, chat model, InstructGPT, Image Model and so on. For example, davinci-search-document. What it does? embedding or something else. Thanks again for your prompt help.

https://platform.openai.com/docs/guides/embeddings/what-are-embeddings

First-generation embeddings are generated by five different model families tuned for three different tasks: text search, text similarity and code search. The search models come in pairs: one for short queries (-query) and one for long documents (-document).

xxx-instruct-beta are mostly just shortcuts to the new naming text-xxx-001. Models that can follow instructions, but differently than ChatGPT follows your instructions.

There’s plenty of documentation if you search, like even the second post in this topic answering your questions.