Thank you for your response, @raymonddavey.
> I assume you want the probability score for the positive/negative/neutral result and not the text you are asking it to classify?
Yes. Given input text x (the text I want to classify, including some prompting language) and y (one of the labels I’ve hand-crafted), I’d like GPT-3’s estimate of Pr(y | x). Since this is a classification problem, Pr(x, y) works too, and so does argmax_y Pr(y | x) — or equivalently argmax_y Pr(x, y), because Pr(y | x) = Pr(x, y) / Pr(x) and Pr(x) doesn’t depend on y.
> If it is, a logprobs of 3 will give you all 3 values
For simple prompts, y’s tokens may indeed be likely enough to appear among the top logprobs. But nothing explicitly prevents y’s tokens from falling outside the top logprobs (or never being sampled), and thus being completely missing from the Completion endpoint’s response.
Here’s an example of a harder classification problem:
```python
import os

import openai

openai.api_key = os.getenv('OPENAI_API_KEY')

prompt = '''
A movie review can only belong to one of these categories: "Just another superhero movie" or "Generic hype".
Which category does this movie review belong to?
"""
A thrill ride for the ages! --Peter K. Rosenthal
"""
'''

response = openai.Completion.create(
    model='text-davinci-003',
    prompt=prompt,
    max_tokens=20,
    temperature=0,
)
print(response['choices'][0]['text'])
# prints: This movie review does not belong to either of the categories.
```
The correct label is 'Generic hype', of course. And while it’s nice to see GPT-3 convey uncertainty, it may well have been the case that Pr('Generic hype' | movie review, prompt) > Pr('Just another superhero movie' | movie review, prompt), even though both probabilities are low. So the method proposed in this question would yield the correct prediction, instead of the noncommittal answer the completion gave.
We can go down the prompt engineering rabbit hole to increase the chance that the Completion endpoint either predicts a class in the label set or includes y in its logprobs. But that’s neither simple nor completely effective. Estimating Pr(y | x) directly is both.
> If it is not structured this way, can you restructure your training so the prompt is everything except the {sentiment} (with a space before it)?
To clarify, no training is necessary in this method. I know what you mean, though, and that is indeed the standard completion approach to classification problems. But the method described in my question is much simpler: there’s no sampling; it just reads out what GPT-3 has already modeled.
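For concreteness, here’s a rough sketch of that method using the same pre-1.0 `openai` library as in the example above. The idea is to append each candidate label to the text and ask the Completion endpoint to echo the prompt back with per-token logprobs (`max_tokens=0, echo=True, logprobs=1`), then sum the logprobs of the label’s tokens. The helper names (`label_logprob`, `classify`) are mine, not from any library.

```python
def label_logprob(text_offsets, token_logprobs, label_start):
    """Sum the logprobs of the tokens at or after character offset
    label_start, i.e. an estimate of log Pr(label | preceding text)."""
    return sum(lp for offset, lp in zip(text_offsets, token_logprobs)
               if offset >= label_start and lp is not None)


def classify(text, labels, model='text-davinci-003'):
    """Return (argmax label, {label: log Pr(label | text)})."""
    import os
    import openai  # imported here so label_logprob stays dependency-free
    openai.api_key = os.getenv('OPENAI_API_KEY')
    scores = {}
    for label in labels:
        response = openai.Completion.create(
            model=model,
            prompt=text + ' ' + label,
            max_tokens=0,  # generate nothing ...
            echo=True,     # ... just return the prompt itself ...
            logprobs=1,    # ... with per-token logprobs
        )
        lp = response['choices'][0]['logprobs']
        scores[label] = label_logprob(lp['text_offset'],
                                      lp['token_logprobs'],
                                      len(text) + 1)
    return max(scores, key=scores.get), scores
```

Note that the first echoed token always has a logprob of `None` (there’s nothing to condition it on), which is why `label_logprob` skips `None` entries. Comparing the summed label logprobs is the argmax over Pr(x, y) mentioned above.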
> Another approach is to look at the “classification” part of “embedding”. From memory it also gives log values.
I assume you’re referring to training a classifier on embeddings as features. That approach does work. But I’d rather not go through embeddings just to loosely estimate something that’s ideally immediately available from an autoregressive language model.
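If one did go the embeddings route, it could look something like the sketch below, where a simple nearest-centroid rule stands in for whatever classifier you’d actually train. Everything here is mine, not from any OpenAI library, and the 3-dimensional vectors are made-up placeholders; real embeddings would come from the Embedding endpoint (e.g. `openai.Embedding.create(model='text-embedding-ada-002', input=texts)`) and have far more dimensions.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    # Component-wise mean of a list of vectors.
    return [sum(xs) / len(vectors) for xs in zip(*vectors)]


def nearest_centroid(embedding, labeled_embeddings):
    """labeled_embeddings: {label: [embedding, ...]}.
    Return the label whose centroid is most cosine-similar."""
    centroids = {label: centroid(vs)
                 for label, vs in labeled_embeddings.items()}
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))


# Placeholder "embeddings"; real ones would come from the Embedding endpoint.
train = {
    'Generic hype': [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    'Just another superhero movie': [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
print(nearest_centroid([0.85, 0.15, 0.05], train))  # prints: Generic hype
```

But note this detour requires labeled examples and a second model on top, whereas the logprob method asks GPT-3 itself one question per label.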