Introducing Embeddings

Hi all!

We’re rolling out Embeddings to all API users as part of a public beta.
Our Embeddings offering combines a new endpoint and set of models to address more advanced search, clustering, and classification tasks. The /embeddings endpoint returns a vector representation of the given input that can be easily consumed by machine learning models and algorithms.

We are releasing three sets of models:

  • Text Similarity: excels in capturing semantic similarity between pairs of text
  • Text Search: excels in finding relevant documents for a query among a collection of documents
  • Code Search: excels in finding relevant code blocks for a natural language query.
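
To make the search use case concrete, here is a minimal sketch of ranking documents against a query by cosine similarity. It assumes the embeddings have already been fetched; the toy 3-dimensional vectors and the helper names `cosine_similarity` and `rank_documents` are made up for illustration and are not part of the API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy vectors standing in for real embeddings (assumption: invented values).
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, docs)
```

Here `ranking` lists the document positions with the closest match first; real embeddings would simply be longer vectors passed through the same functions.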

Please read our in-depth guide for examples on how to apply embeddings to different use cases, and refer to the API reference for more details on how to query the endpoint.

The /embeddings endpoint is offered for free through the end of 2021. If you have any questions, issues, or general feedback, or if you would like to use embeddings for academic or research purposes, please contact


Has the embeddings endpoint been pulled down? I started using it yesterday, and today, there is no hint of documentation or guide anywhere to be found on the site.

Usage stats from yesterday are also gone. The code I wrote still works, but is it likely to stop soon?


Well that ain’t great. I don’t believe we intended this to happen.

Thanks for the heads up! We’re looking into it.

The embedding endpoints should still be functioning, so if you have running code it should still work.


I’ll be the first to say it: Text Similarity is my favorite part about GPT-3 <3


And we’re back! Sorry about that.


Confirmed. Thanks a lot!

That’s an amazing TAT. Kudos.

This is excellent. I switched the search for a project I’m working on over to these embeddings and it’s so much better, even with just Ada. I was using GloVe before.

One feature request would be to have access to these embedding layers for fine-tuned models too. If that’s even possible.


We’re hoping to support embeddings for fine-tuned models at some point in the future but don’t have an expected release date.


This is awesome. What’s the meaning of the different indices?
i.e. response["data"][0] vs response["data"][1]
Maybe they indicate which token’s embedding is used? i.e. 0 being the last, 1 the one before last…
(couldn’t find it in the guide)

Close! We only return one embedding per input, considered as a whole. The length of the embedding won’t change based on the length of the input passed in.

There should be one entry in the “data” array for each input you’ve submitted in the request. To figure out which embedding maps to which input, look at the “index” field for that entry.
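
A minimal sketch of that mapping, assuming a response shaped like the one described above (a "data" array whose entries carry "index" and "embedding"). The mock response and the helper name `embeddings_in_input_order` are hypothetical, and the entries are deliberately listed out of order only to show that the lookup relies on "index" rather than on position.

```python
def embeddings_in_input_order(inputs, response):
    """Pair each input with its embedding using the "index" field of each entry."""
    by_index = {entry["index"]: entry["embedding"] for entry in response["data"]}
    return [(text, by_index[i]) for i, text in enumerate(inputs)]

# Hypothetical response; real embeddings have far more dimensions.
inputs = ["first text", "second text"]
mock_response = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ]
}
pairs = embeddings_in_input_order(inputs, mock_response)
```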

You can see the index field in the response object in this doc: OpenAI API

Oh I see - Is sending tokens already supported?
Sending an array of token arrays as stated here OpenAI API

response = openai.Engine(id="babbage-similarity").embeddings(
    input=[['Sample', 'Ġdocument', 'Ġtext', 'Ġgoes', 'Ġhere']],
)
embeddings = response['data'][0]['embedding']

InvalidRequestError: [['Sample', 'Ġdocument', 'Ġtext', 'Ġgoes', 'Ġhere']] is not valid under any of the given schemas - 'input'

(using openai==0.11.3)

We support arrays of token arrays. Tokens are ints. In this case you’ve submitted an array of arrays of strings. For strings, we support a single string and an array of strings. The following might be what you want:

response = openai.Engine(id="babbage-similarity").embeddings(
    input=['Sample', 'Ġdocument', 'Ġtext', 'Ġgoes', 'Ġhere'],
)

Otherwise you’ll need to tokenize your input.
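
As a rough local check of the three input shapes named in this reply (a single string, an array of strings, an array of token arrays of ints), something like the sketch below could be run before sending a request. The helper `describe_input` is hypothetical and only mirrors the description given here; it is not the API's actual schema validation.

```python
def describe_input(value):
    """Classify a candidate `input` value by the shapes described in the reply above."""
    if isinstance(value, str):
        return "single string"
    if isinstance(value, list) and value:
        if all(isinstance(v, str) for v in value):
            return "array of strings"
        if all(
            isinstance(v, list) and v and all(isinstance(t, int) for t in v)
            for v in value
        ):
            return "array of token arrays"
    return "unsupported"

# The nested list of strings from the failing request does not match any shape.
failing = describe_input([["Sample", "Ġdocument", "Ġtext"]])
# Token IDs below are arbitrary ints, used only to show the accepted nesting.
tokens_ok = describe_input([[101, 202, 303]])
```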

Great thanks - I was looking for the following :slight_smile:

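from transformers import GPT2TokenizerFast
# Tokenize locally so the request carries token IDs (ints) instead of strings
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
response = openai.Engine(id="babbage-similarity").embeddings(
    input=[tokenizer.encode("Sample document goes here", add_special_tokens=False)],
)
embeddings = response['data'][0]['embedding']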

I figured it’s the stochastic nature of the sampling

This is cool! I appreciate the Use Cases provided…seed has been planted in the field of possibility.


What does “embedding” mean in the OpenAI context? Why was this name chosen? What is the difference between “embedding” and “non-embedding”, and what would the opposite of “embedding” be? I don’t understand the name or what I should picture under this term. Is it only a property/method of the model, or is it something like a reward/food for the model?

“Embedding” is a term of art in the AI/ML field.
In simple terms, an embedding is a multi-dimensional numeric representation of a concept that contains a distillation of its semantics (properties).

For example, the redness, roundness, and sweetness of an apple are three properties that can be expressed numerically, and together they constitute a 3-dimensional embedding of “apple”. Think of it as a point (vector) in 3D space. A sphere around that point contains the things that are close to an apple.

Now expand this to many more dimensions, and consider that the individual dimensions are not human-interpretable (unlike sweetness, redness, etc. above), and you have your 1024/2048/4096 etc. dimensional embeddings.
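
The apple picture can be sketched in a few lines. The scores below are hand-picked for illustration (assumptions, not model output), and `euclidean` and `nearest` are hypothetical helper names; the point is only that "close in meaning" becomes "close in space".

```python
import math

# Invented scores in [0, 1] for (redness, roundness, sweetness).
fruits = {
    "apple": [0.9, 0.9, 0.7],
    "cherry": [1.0, 0.9, 0.8],
    "banana": [0.0, 0.1, 0.8],
    "lemon": [0.0, 0.6, 0.1],
}

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(name, items):
    """Return the other item whose vector lies closest to `name`."""
    target = items[name]
    others = [k for k in items if k != name]
    return min(others, key=lambda k: euclidean(target, items[k]))

closest_to_apple = nearest("apple", fruits)
```

With these made-up numbers the cherry lands inside the smallest sphere around the apple, while the banana and lemon sit much further away.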



Excellent explanation.

The idea of “concept” being a synonym for “embedding” is genius, but only in this context; in general, it may not hold. As an example off the top of my head, in a collaborative filtering context for recommendations, the users’ embeddings may be synonymous with preferences. In a simple tf-idf model, the discovered embeddings are (roughly) the normalized relative frequencies of terms.
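
For the tf-idf case, a minimal sketch under one common formulation (term frequency within the document times log(N / document frequency)); libraries differ in smoothing and normalization, so treat the exact weights as an assumption. The helper `tfidf_vectors` is hypothetical and expects non-empty, whitespace-separated documents.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, one tf-idf vector per document)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append(
            [(counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

vocab, vectors = tfidf_vectors(["the cat sat", "the dog sat", "the cat ran"])
```

A term that appears in every document ("the" here) gets weight zero, while a term unique to one document gets the largest weight in that document's vector.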

However, given that the question was about the OpenAI GPT3 embeddings endpoint, I agree with your interpretation.


When I tried the snippet

def get_embedding(text, engine="davinci-similarity"):
    # Replace newlines with spaces before requesting the embedding
    text = text.replace("\n", " ")
    return openai.Engine(id=engine).embeddings(input=[text])["data"][0]["embedding"]

get_embedding("Its a car")

I got an error saying

/usr/local/lib/python3.7/dist-packages/openai/ in _interpret_response_line(self, rbody, rcode, rheaders, stream)
317 if stream_error or not 200 <= rcode < 300:
318 raise self.handle_error_response(
→ 319 rbody, rcode,, rheaders, stream_error=stream_error
320 )
321 return resp

InvalidRequestError: Engine not found

Why is this happening? The same happens with babbage-similarity, curie-similarity, etc.
But I do get a response from this snippet:
response = openai.Completion.create(engine="davinci", prompt="This is a test", max_tokens=5)
So the issue seems specific to getting the embedding.
Thanks in advance


Welcome to the OpenAI community @rramachandra93!!

Interesting, the code looks correct. A little more information is needed here.

What operating system are you using and what terminal/editor are you using? Did you copy & paste the code in directly?