I’ve got a fine-tuned model that is trained to output one of two classes (“yes” or “no”). In a small minority of cases, the “text” field returned by the completion contains one class, e.g. “yes”, but when I look at the “top_logprobs”, that class’s logprob is significantly lower (so lower probability) than the other class’s.
For instance, there is a row where the text is “yes”, but the first element of top_logprobs is {"no": -0.0232, "yes": -3.7727}. I’m wondering if this might have something to do with how I’m converting the logprobs to probabilities: torch.sigmoid(logprob) vs. np.e**logprob (which is what I’m doing right now), but I’m not sure.
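Here’s a minimal sketch of the two conversions I’m comparing, using the numbers from the example row above (torch and numpy are just what I already have in my pipeline):

```python
import numpy as np
import torch

# Top logprobs from the example row above
top_logprobs = {"no": -0.0232, "yes": -3.7727}

for token, logprob in top_logprobs.items():
    prob_exp = np.e ** logprob  # what I'm doing now: exp(logprob)
    prob_sigmoid = torch.sigmoid(torch.tensor(logprob)).item()  # the alternative I'm unsure about
    print(f"{token}: exp -> {prob_exp:.4f}, sigmoid -> {prob_sigmoid:.4f}")

# exp:     no ~ 0.977, yes ~ 0.023  (these two sum to ~1)
# sigmoid: no ~ 0.494, yes ~ 0.022  (these don't sum to 1)
```

Either way, the token the API returned (“yes”) is the one with the much lower probability, which is the part that confuses me.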