OpenAI embeddings with 3072 dimensions

It’s probably just because they use some variation of cross-entropy for training, maybe with a temperature-style scaling factor, no?
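For what it's worth, here's a minimal sketch of what that could look like: an InfoNCE-style softmax cross-entropy over cosine similarities with a temperature scaling factor, using the rest of the batch as negatives. The function name, temperature value, and batch setup are all illustrative, not anything OpenAI has confirmed:

```python
import numpy as np

def contrastive_cross_entropy(query_emb, doc_emb, temperature=0.05):
    """Softmax cross-entropy over cosine similarities (InfoNCE-style sketch).

    Row i of query_emb is the positive pair for row i of doc_emb; every
    other row in the batch acts as an in-batch negative. `temperature`
    is the scaling factor applied to the similarity logits.
    """
    # L2-normalize so dot products are cosine similarities in [-1, 1]
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    logits = (q @ d.T) / temperature  # (batch, batch) similarity matrix

    # numerically stable log-softmax per row; the "correct" class
    # for query i is document i (the diagonal)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

With matched pairs the diagonal dominates and the loss goes toward zero; with unrelated pairs it sits near log(batch_size). Note the cosine similarities themselves can be negative; the softmax just makes the positive pair's similarity higher than the negatives'.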

I can’t think of a good, simple loss function that allows for negatives in this context.