It looks like 'text-embedding-3' embeddings at smaller sizes are truncated and rescaled versions of the full-dimension vectors.

The base64 embeddings API response contains the exact four bytes of each 32-bit float.
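For reference, a minimal sketch of unpacking such a payload; the sample bytes here are synthetic, and a little-endian float32 layout is assumed:

```python
import base64

import numpy as np

# Synthetic stand-in for a base64 payload: each embedding value is
# assumed to be a little-endian 32-bit float, 4 bytes per dimension.
raw = base64.b64encode(np.array([0.0256, -0.0183, -0.0114], dtype="<f4").tobytes())

def decode_embedding(b64: bytes) -> np.ndarray:
    """Unpack a base64 embedding payload into a float32 vector."""
    return np.frombuffer(base64.b64decode(b64), dtype="<f4")

vec = decode_embedding(raw)
```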

A peculiar but expected thing happens when you request alternate dimensions: the maximum component values get smaller as the dimension count grows, since a unit-norm vector spreads its magnitude across more components.
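That shrinking-maximum effect can be illustrated without any API calls; random unit-norm vectors (a stand-in for real embeddings) show the same behavior:

```python
import numpy as np

rng = np.random.default_rng(0)

def max_component(d: int) -> float:
    """Largest |value| in a random unit-norm vector of dimension d."""
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    return float(np.abs(v).max())

# Unit-norm mass spreads over more coordinates as d grows,
# so the largest single component tends to shrink.
lo_d = max_component(256)
hi_d = max_component(3072)
```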

So for ideal quality it looks like one would want to store a scale factor alongside the tensor, and multiply by that fraction when unpacking from RAM.
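One way that scale-factor idea might look in practice (the `target_max` of 0.875 is an arbitrary illustrative choice, picked so values land well inside a quantizer's range):

```python
import numpy as np

def pack_scaled(vec: np.ndarray, target_max: float = 0.875):
    """Divide out a per-vector scale so the quantized range is fully used."""
    scale = float(np.abs(vec).max()) / target_max  # assumes a nonzero vector
    return vec / scale, scale

def unpack_scaled(q: np.ndarray, scale: float) -> np.ndarray:
    """Multiply the stored scale back in when unpacking."""
    return q * scale

v = np.array([0.0256, -0.0183, 0.0410], dtype=np.float32)
q, s = pack_scaled(v)
restored = unpack_scaled(q, s)
```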

== 8-bit results ==

Here's FP8 e4m3b11fnuz - exponent: 4 bits, mantissa: 3 bits, bias: 11 (extended range: no inf; NaN is represented by 0b1000'0000). Same 3-large at d:1536 as before, with values scaled by x16 before quantization.
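A pure-NumPy sketch of that format's round-trip, assuming the layout just described (bias 11, subnormals at exponent field 0, code 0x80 reserved for NaN); quantization here is a simple nearest-value table lookup rather than hardware rounding:

```python
import numpy as np

def _decode_e4m3b11fnuz(code: int) -> float:
    """Decode one float8 e4m3, bias-11, fnuz code (0x80 is NaN, no inf)."""
    if code == 0x80:
        return float("nan")
    sign = -1.0 if code & 0x80 else 1.0
    exp = (code >> 3) & 0xF
    man = code & 0x7
    if exp == 0:  # subnormal: man/8 * 2**(1 - 11)
        return sign * man * 2.0 ** (1 - 11 - 3)
    return sign * (1 + man / 8.0) * 2.0 ** (exp - 11)

# All 255 finite values, used for nearest-value quantization.
_TABLE = np.array([_decode_e4m3b11fnuz(c) for c in range(256) if c != 0x80])

def quantize_fp8(x: np.ndarray) -> np.ndarray:
    """Round each element of a 1-D array to the nearest representable value."""
    idx = np.abs(x[:, None] - _TABLE[None, :]).argmin(axis=1)
    return _TABLE[idx]

# Scaling by x16 before quantizing moves the small embedding values
# into a better-resolved part of the FP8 range; divide back out after.
v = np.array([0.02560364, -0.01831481, -0.01141823])
q = quantize_fp8(v * 16) / 16
```

Running this on the first values of embedding 0 reproduces the float8 row shown in the values sample further down.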

== Cosine similarity comparisons ==

- 0:"What is an OpenAI GPT useful for?" <==> 1:" [1] Documentation does not superced" -
 float32: 0.427506
 float08: 0.427878
- 0:"What is an OpenAI GPT useful for?" <==> 2:" [2] Both of our new embeddings mode" -
 float32: 0.293851
 float08: 0.293858
- 0:"What is an OpenAI GPT useful for?" <==> 3:" [3] We’re rolling out custom versio" -
 float32: 0.612172
 float08: 0.612752
(more semantic-similarity comparisons, 32-bit vs 8-bit, continue below)
- 0:"What is an OpenAI GPT useful for?" <==> 4:" [4] GPTs let you customize ChatGPT " -
 float32: 0.480640
 float08: 0.480849
- 0:"What is an OpenAI GPT useful for?" <==> 5:" [5] The GPT Store is rolling out la" -
 float32: 0.465091
 float08: 0.464570
- 0:"What is an OpenAI GPT useful for?" <==> 6:" [6] We’ve set up new systems to hel" -
 float32: 0.501608
 float08: 0.500995
- 0:"What is an OpenAI GPT useful for?" <==> 7:" [7] Developers can connect GPTs to " -
 float32: 0.435705
 float08: 0.435533
- 0:"What is an OpenAI GPT useful for?" <==> 8:" [8] Since we launched ChatGPT Enter" -
 float32: 0.520939
 float08: 0.520229
- 0:"What is an OpenAI GPT useful for?" <==> 9:" [9] We want more people to shape ho" -
 float32: 0.558575
 float08: 0.558209
- 0:"What is an OpenAI GPT useful for?" <==> 10:" [10] Creating a GPTHow to create a " -
 float32: 0.527425
 float08: 0.526704
- 0:"What is an OpenAI GPT useful for?" <==> 11:" [11] Here’s how to create a GPT:  H" -
 float32: 0.517636
 float08: 0.517361
- 0:"What is an OpenAI GPT useful for?" <==> 12:" [12] Advanced SettingsIn the GPT Ed" -
 float32: 0.278446
 float08: 0.277858
- 0:"What is an OpenAI GPT useful for?" <==> 13:" [13] Settings in the Configure tab:" -
 float32: 0.436078
 float08: 0.434184
- 0:"What is an OpenAI GPT useful for?" <==> 14:" [14] FAQ: Q: How many files can I u" -
 float32: 0.218532
 float08: 0.217219
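The comparison above can be reproduced in spirit with a small sketch; the random correlated vectors and the 3-bit mantissa rounding helper here are stand-ins for real embeddings and the exact FP8 format:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
a = rng.standard_normal(1536)
b = 0.5 * a + rng.standard_normal(1536)  # correlated pair, stands in for embeddings

def round_mantissa(x: np.ndarray, bits: int = 3) -> np.ndarray:
    """Coarse FP8 stand-in: keep roughly `bits` mantissa bits per element."""
    m, e = np.frexp(x)
    return np.ldexp(np.round(m * 2.0 ** (bits + 1)) / 2.0 ** (bits + 1), e)

full = cosine(a, b)
low = cosine(round_mantissa(a), round_mantissa(b))
```

With ~1500 dimensions the per-element rounding errors largely average out of the dot product, which is why the float32 and float8 similarities above agree to about three decimal places.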
== Embedding values sample - 32 vs 8 ==

For each index, the first row is float32 and the second is the float8 round-trip.

0 ['+0.02560364', '-0.01831481', '-0.01141823', '-0.02364949']
0 ['+0.02539062', '-0.01757812', '-0.01171875', '-0.02343750']
1 ['+0.01583907', '-0.01921137', '-0.01613504', '-0.01042185']
1 ['+0.01562500', '-0.01953125', '-0.01562500', '-0.01074219']
2 ['+0.04101363', '-0.03312357', '-0.00581640', '-0.00354271']
2 ['+0.03906250', '-0.03125000', '-0.00585938', '-0.00366211']
3 ['+0.02819351', '+0.00425646', '-0.02809555', '-0.01454738']
3 ['+0.02734375', '+0.00439453', '-0.02734375', '-0.01464844']
4 ['+0.02966771', '+0.00471444', '-0.01965263', '-0.01287654']
4 ['+0.02929688', '+0.00488281', '-0.01953125', '-0.01269531']
5 ['+0.02289144', '+0.01062817', '-0.02655740', '-0.01890262']
5 ['+0.02343750', '+0.01074219', '-0.02734375', '-0.01953125']
6 ['+0.04714031', '+0.00739293', '+0.00643247', '-0.00565372']
6 ['+0.04687500', '+0.00732422', '+0.00634766', '-0.00585938']
7 ['+0.04821001', '-0.00359391', '-0.01465362', '-0.01400830']
7 ['+0.04687500', '-0.00366211', '-0.01464844', '-0.01367188']