High-level problem
Say my input text gets tokenized into `[t1, t2, t3, t4]`. Can I use the API to estimate the log-probability of the text as `log Pr(t1, t2, t3, t4)`?
If this estimation isn't already implemented in the API, is there a way to estimate this type of conditional probability:

`log Pr(t3, t4 | t1, t2) = log Pr(t3 | t1, t2) + log Pr(t4 | t1, t2, t3)` ?
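Spelled out in code, the estimator I'm after looks like the sketch below. The helper `next_token_logprob` is hypothetical; it is exactly the per-token primitive I don't know how to obtain from the API:

```python
from typing import List


def next_token_logprob(context: List[str], token: str) -> float:
    """Hypothetical: log Pr(token | context) under the model.
    This per-token conditional is the primitive I'm looking for."""
    raise NotImplementedError


def conditional_logprob(context: List[str], completion: List[str]) -> float:
    """Chain rule: log Pr(completion | context) as a sum of per-token terms."""
    total = 0.0
    history = list(context)
    for token in completion:
        total += next_token_logprob(history, token)
        history.append(token)
    return total


# log Pr(t3, t4 | t1, t2) = log Pr(t3 | t1, t2) + log Pr(t4 | t1, t2, t3)
# conditional_logprob(["t1", "t2"], ["t3", "t4"])
```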
In this conditional probability, I'm the one providing the hypothesized completion text `[t3, t4]`, not GPT-3. Setting the `logprobs` argument isn't sufficient because (1) GPT-3 may not happen to sample `t3` or `t4`, or (2) the response may exclude `t3` or `t4` if they aren't among the top `logprobs` tokens at their positions. That's why the Completions endpoint doesn't seem sufficient for my problem.
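To make the failure mode concrete, here is roughly what comes back from the Completions endpoint as I understand its response shape (legacy pre-1.0 `openai` Python interface; the model name and prompt are placeholders):

```python
import openai  # legacy (pre-1.0) interface; assumes an API key is configured

response = openai.Completion.create(
    model="davinci",
    prompt="t1 t2",  # stand-in for my real context [t1, t2]
    max_tokens=2,
    logprobs=5,      # top-5 alternatives per position, the API's maximum
)

# One {token: logprob} dict per position that GPT-3 actually sampled.
top_logprobs = response["choices"][0]["logprobs"]["top_logprobs"]

# My hypothesized tokens t3 and t4 are scored only if they happen to land
# in these top-5 dicts; otherwise their log-probabilities are simply absent.
for position, candidates in enumerate(top_logprobs):
    print(position, candidates)
```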
Motivation
I'd like to evaluate a simple approach to language classification tasks, specifically ones where the labels are textually meaningful. For example, in a sentiment classification task, an alternative to text completion is to estimate the probability of the full, already-completed text:

```
The sentiment of this tweet
"""
I loved the new Batman movie!
"""
is {sentiment}.
```

where `{sentiment}` is replaced with `positive`, `negative`, or `neutral`, and then return the sentiment that gave the highest log-probability.
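Put as code, the whole method is just an argmax over label log-probabilities. `sequence_logprob` below is the same hypothetical scoring primitive from the first section, i.e. the missing piece:

```python
from typing import Sequence


def sequence_logprob(prefix: str, completion: str) -> float:
    """Hypothetical: log Pr(completion | prefix) under the model,
    the conditional probability from the first section."""
    raise NotImplementedError


def classify(
    tweet: str, labels: Sequence[str] = ("positive", "negative", "neutral")
) -> str:
    """Return the label whose completed prompt the model scores highest."""
    prefix = f'The sentiment of this tweet\n"""\n{tweet}\n"""\nis '
    return max(labels, key=lambda label: sequence_logprob(prefix, label + "."))


# classify("I loved the new Batman movie!")  ->  ideally "positive"
```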
Has this method been evaluated anywhere? (If I understand correctly, this approach is quite different from the deprecated Classifications endpoint.)
I suppose OpenAI won't release probabilities too liberally, because a user could train on them and compete with GPT-3 or something. But I'm hoping there's some way to support the above approach to classification. It seems more straightforward than embeddings or completion.