Discrepancy in embeddings precision

Using the Python library provided by OpenAI, embeddings requests return floats with up to 18 decimal places of precision. Other methods (curl, hyper/reqwest) return floats with only about half that precision.

Anyone should be able to reproduce this by simply copy/pasting the example request provided in the documentation:

curl https://api.openai.com/v1/embeddings \
  -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "The food was delicious and the waiter...",
       "model": "text-embedding-ada-002"}'


import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.Embedding.create(
    model="text-embedding-ada-002",
    input="The food was delicious and the waiter..."
)

Unless there is something that I’m missing (quite possible), this would seem to be a pretty big problem for anyone interested in using or building a library outside of the one provided in Python.

This was discussed before.

The extra decimal places are essentially noise and can be discarded. If you’d like proof, try performing some distance tests and you’ll see no difference.
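A quick sketch of the kind of distance test suggested above, using a synthetic unit vector rather than a real embedding: round-tripping the values through float32 (roughly the 7 significant digits you see from the raw API) barely moves the cosine distance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a unit-length embedding at full (float64) precision.
v = rng.standard_normal(1536)
v /= np.linalg.norm(v)

# Round-trip through float32, which keeps ~7 significant decimal digits.
v32 = v.astype(np.float32).astype(np.float64)

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# The distance between the two versions is vanishingly small,
# far below any threshold you'd use for ranking or clustering.
print(cosine_distance(v, v32))
```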

The reasoning behind this (possibly incorrect, as I can’t access my previous conversations) is that by default it’s sent as a float (about 7 significant decimal digits).

I believe there is a hidden parameter one can use to have it sent as a double.


Thanks. I hope that’s correct. I spent the better part of a week writing a Rust library only to notice the apparent discrepancy in testing the embeddings endpoint. I’ll try some distance tests as you suggest.

Again, I’m probably missing something. I’m quite sure it also had something to do with base64. There was a really nice write-up that was eventually published on GitHub. I’ll reply once I find it.

In the meantime, you can see the difference through their library here. The answer is there.

Right, glancing through their source code to see if there was some missing parameter was one of the first things I tried, and I did notice the base64 encoding. I tried adding the header "Accept-Encoding: base64" to my Rust code, to no effect.

I believe it has to do with this segment:

# If a user specifies base64, we'll just return the encoded string.
# This is only for the default case.
if not user_provided_encoding_format:
    for data in response.data:

        # If an engine isn't using this optimization, don't do anything
        if type(data["embedding"]) == str:
            data["embedding"] = np.frombuffer(
                base64.b64decode(data["embedding"]), dtype="float32"
            ).tolist()
which is created here

user_provided_encoding_format = kwargs.get("encoding_format", None)

My memory is still a bit fuzzy, but I believe you can actually include encoding_format in your request.
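For illustration, here is roughly what including encoding_format in the raw request body could look like. This is a sketch only — the parameter name and values are taken from the library snippet above, and the request is built but not sent:

```python
import json
import os
import urllib.request

payload = {
    "input": "The food was delicious and the waiter...",
    "model": "text-embedding-ada-002",
    # "base64" asks for the compact encoding; "float" for plain JSON floats.
    "encoding_format": "base64",
}

req = urllib.request.Request(
    "https://api.openai.com/v1/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.getenv('OPENAI_API_KEY')}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would actually send it.
```

The key point is that it’s a field in the JSON body, not an HTTP header — which would explain why the Accept-Encoding header attempt had no effect.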

I found it!


Using the API method I am getting, on average, 9 decimal places. This is more than sufficient to use since all the vectors are scaled to unit length from the embedding engine. In your ada-002 example, it has 1536 dimensions, so if you imagine a unit vector in this space with equal values, you get a vector of 1/sqrt(1536), which is 0.0255…, so 2 decimal places. Using higher dimension models like the old davinci embedding, this could get worse, but only by a factor of 10, so you are still good.
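The 1/sqrt(1536) figure is easy to verify:

```python
import math

# Each component of an equal-valued unit vector in 1536 dimensions.
print(1 / math.sqrt(1536))  # 0.02551...
```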

So, as stated earlier, for this model dimension, anything more than 6–7 decimal places isn’t carrying much information, and it can actually be bad if you store your intermediate embeddings as strings in a database (the lower precision cuts the DB size roughly in half, which is better too).
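A rough illustration of the storage point, using a synthetic vector serialized as JSON text at full precision versus rounded to 7 decimal places:

```python
import json
import random

random.seed(0)

# Synthetic "embedding" values in a realistic range for ada-002 components.
vec = [random.uniform(-0.03, 0.03) for _ in range(1536)]

full = json.dumps(vec)                            # full float64 repr, ~17 digits each
rounded = json.dumps([round(x, 7) for x in vec])  # 7 decimal places

# The rounded serialization is roughly half the size.
print(len(full), len(rounded))
```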
