Introducing Embeddings

Hi,
Thanks for your reply. I copied the code directly from the documentation. I'm using Colab in Google Chrome on Windows 10 Home. Before running the code I also set the API key from my profile, so it doesn't look like an authentication problem. I also get a response from the Completion.create query; the problem only occurs when trying to get the embeddings.
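For context, the kind of call under discussion looks roughly like this. This is only a sketch against the pre-1.0 openai-python client; the engine names are the era's private-beta similarity engines mentioned later in this thread, and the key is a placeholder:

```python
import openai

openai.api_key = "sk-..."  # placeholder; normally set from your profile/environment

# Completion endpoint - reported working in this thread:
completion = openai.Completion.create(
    engine="davinci", prompt="Hello,", max_tokens=5
)

# Embeddings endpoint - the call that was failing:
response = openai.Embedding.create(
    engine="davinci-similarity", input="Some text to embed"
)
vector = response["data"][0]["embedding"]
print(len(vector))
```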

Thanks for the quick response and providing that information!

The reason I asked those questions is that copying and pasting code from the documentation can introduce an encoding issue, specifically on Windows: the Windows default encoding is CP1252, so the typographic right-quote character does not map to a valid quote and errors arise during execution. It's possible that this is what's happening here, but I'm not sure, and I don't have access to those models yet to test it myself.

Some IDEs can override the default encoding, though not all do. Google Colab's default encoding appears to be UTF-8, which does interpret that problematic right-quote character correctly, but it can be overridden in some cases.
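One quick way to check whether that's the culprit is to scan the pasted snippet for typographic quote characters, which Python rejects as invalid syntax. A small sketch (the example string is hypothetical):

```python
# Pasted code sometimes arrives with typographic quotes (U+2018/U+2019,
# U+201C/U+201D) instead of ASCII quotes. Python 3.9+ rejects them, e.g.:
#   >>> text = ’hello’
#   SyntaxError: invalid character '’' (U+2019)
snippet = "response = openai.Embedding.create(input=\u2019hello\u2019)"
smart_quotes = "\u2018\u2019\u201c\u201d"
hits = [(i, c) for i, c in enumerate(snippet) if c in smart_quotes]
print(hits)  # (index, character) pairs for any smart quotes found
```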

Can you try typing everything out by hand, to rule out that this is what's going on, and let us know what happens?

Your code is correct. We temporarily rolled back default unlimited access to everyone. We’ll message on this soon, and will work on quickly making it available to everyone again.

Hi,
Yeah, I tried typing the entire code into a new cell, including the API key, but the problem persists. Does it need to download the model before use, like with sentence-transformers? I'm not sure, but the same error is raised again after typing the code.

Hi,
No, actually I signed up and started using it just today.

Hi,
But we can still get a response for the completion task with the davinci model. So you have rolled back access for only certain services and models? Kindly reply.

Completions with the Davinci model are active and working, yes. It looks like the Embeddings endpoint has been taken offline for the time being. The new similarity/search models appear to have been taken offline as well, since I did see them available last week.

However, I'm not 100% certain whether private beta access granted before the release of these new models also covers the private beta access scoped for the newer models. Since Codex and the newer models are both labelled as needing private beta access, I'm led to believe they fall under the same scope, but I don't know for sure.

Has anyone successfully used the davinci-similarity embeddings and/or compared them with the ada/babbage/curie embeddings?

I recently used the endpoint to set up ‘similar links’ for Gwern.net link annotations (as a kind of automated ‘see also’), doing vector search on abstract embeddings. My Haskell vector search library is slow, buggy, and hard to use, so I plan to switch to a real vector search library like FAISS at some point; nevertheless, it was returning decent results with ada/curie (I didn’t try babbage) using L2-normalized Euclidean distance, which I think is equivalent to cosine similarity for this use case. Then I upgraded to davinci-similarity and the results suddenly turned into garbage: the similar links were no longer even slightly similar and looked as if they were picked at random. (They also turned into garbage with all the engines when I tried to keep all of the semantically relevant newlines, which is very strange, but the docs do warn you that will happen and tell you to strip all \ns, so I was annoyed but not surprised.) The vector search library claimed it was still getting high recall and finding the nearest neighbors as it should, so I couldn’t assume it was at fault again.
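As an aside on the ‘equivalent to cosine similarity’ point: for unit-length vectors, squared Euclidean distance and cosine similarity produce the same nearest-neighbour ordering, since ||a - b||^2 = 2 - 2*cos(a, b). A quick numpy check of the identity:

```python
import numpy as np

# For L2-normalized vectors:
#   ||a - b||^2 = ||a||^2 + ||b||^2 - 2(a . b) = 2 - 2 * cos(a, b)
rng = np.random.default_rng(0)
a, b = rng.normal(size=1024), rng.normal(size=1024)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

cos_sim = a @ b
sq_dist = np.sum((a - b) ** 2)
print(np.isclose(sq_dist, 2 - 2 * cos_sim))  # True
```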

I notice all of the embedding Guide examples avoid using the davinci engine, and I don’t see anyone here posting about their success with davinci-similarity or how much better it is than ada/babbage/curie.

So I’m curious: davinci-similarity users, where are you? Does it work for you? How much better is it? Is there some trick or special distance function you needed to use?

Hi all,

Thanks to everyone who has tried embeddings to date! Your feedback has been invaluable.

We are working on some updates and are pausing new user access for now.

If you are interested in using embeddings in the future, please fill out this form.

If you have any questions or if you would like to use embeddings for academic or research purposes, please contact embeddings@openai.com

Hi,
Thanks for your reply. Yeah, maybe the endpoint was taken offline for the time being. Not sure.
Regards

I’m still testing this out, but these embeddings should pair really well with Google’s new Matching Engine.

It’s a hosted Approximate Nearest Neighbour search that scales to billions of vectors with search times of ~10ms.

Their service has a Two Tower implementation too, but being able to create a fine-tuned Text Search embedding model right inside OpenAI would be the dream. I’m testing this out using Curie embeddings as a starting point.
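In the meantime, for anyone who wants to prototype the same nearest-neighbour setup locally before wiring up a hosted service, here's a minimal FAISS sketch. The dimensionality and random vectors are stand-ins (real rows would come from the embeddings endpoint); normalizing first makes inner product equal to cosine similarity:

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 1024  # stand-in dimensionality; depends on the embedding engine
rng = np.random.default_rng(0)

# Stand-in corpus: in practice these rows would be API embeddings.
xb = rng.normal(size=(10_000, d)).astype("float32")
xb /= np.linalg.norm(xb, axis=1, keepdims=True)  # normalize so IP == cosine

index = faiss.IndexFlatIP(d)  # exact inner-product search
index.add(xb)

query = xb[:1]  # query with the first corpus vector
scores, ids = index.search(query, 5)
print(ids[0])  # nearest neighbours; the query itself should rank first
```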

Demanding kids, no?! I tell you, once you give them a scoop, the dopamine kicks in! It cannot be unseen. You’re hooked on that generational autoregressive cult of Transformers.

Ah, spring time, you do know what they say about being a parent.

It’s the ‘best experience in the whole wild world’.
