Why does my completion sometimes return the token with the higher log-loss of the two?

I’ve got a fine-tuned model that is trained to output one of two classes (“yes” or “no”). In a small minority of cases, the “text” field returned by the completion contains one class, e.g. “yes”, but when I look at the “top_logprobs”, that class has a significantly higher log-loss (so a lower probability) than the other.

For instance, there is a row where the text is “yes”, but the first element of top_logprobs is {“no”: -0.0232, “yes”: -3.7727}. I’m wondering whether this has something to do with how the probability is calculated, e.g. torch.sigmoid(logprob) vs. np.e**logprob (which is what I’m doing right now), but I’m not sure.
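For reference, here’s roughly what I’m doing now to turn the returned logprobs into probabilities (a minimal sketch on the example row above; the dict is just how I’ve pulled the values out of the response, not the raw API object):

```python
import numpy as np

# The top_logprobs entry from the example row above.
top_logprobs = {"no": -0.0232, "yes": -3.7727}

# My current conversion: probability = e**logprob.
probs = {tok: np.exp(lp) for tok, lp in top_logprobs.items()}
# -> {'no': ~0.977, 'yes': ~0.023}, so "no" should win by a wide margin,
# yet the completion's "text" field came back as "yes".
```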

I’m thinking the issue might be that GPT is applying a sigmoid to the logprob values and then checking whether they’re over 0.5 (rather than using e^logprob or a softmax and taking the max probability). The cases where it goes wrong would be ones where even the token whose sigmoid gives the larger probability is still below 0.5 (a quick check of this on the example row is below). I can’t find anything in the reference about making completions use e^logprob or softmax, so I’d appreciate any help there.
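Here’s the quick check I mean; this is just a sketch of my guess at the rule, not anything taken from the docs:

```python
import numpy as np

top_logprobs = {"no": -0.0232, "yes": -3.7727}

# Hypothesized rule: sigmoid each logprob (as torch.sigmoid would) and
# check whether it clears 0.5, instead of taking the exp/softmax argmax.
over_half = {tok: 1 / (1 + np.exp(-lp)) > 0.5 for tok, lp in top_logprobs.items()}
# -> {'no': False, 'yes': False}: "no" sigmoids to ~0.494 and "yes" to ~0.022,
# so neither token clears 0.5, even though "no" is the clear winner under e**logprob.
```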