**High-level problem**

Say my input `text` gets tokenized into `[t1, t2, t3, t4]`. Can I use the API to estimate the log-probability of `text` as log Pr(`t1, t2, t3, t4`)?

If this estimation isn’t already implemented in the API, is there a way to estimate this type of conditional probability:

log Pr(`t3, t4` | `t1, t2`) = log Pr(`t3` | `t1, t2`) + log Pr(`t4` | `t1, t2, t3`) ?
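To make the decomposition concrete: if one log-probability per token of the full text were available, the conditional score would just be the sum over the completion's positions. A toy sketch (the function name and the numbers are mine, not from any API):

```python
def conditional_logprob(token_logprobs, prefix_len):
    """Sum log Pr(t_i | t_1..t_{i-1}) over the completion tokens.

    token_logprobs: one log-probability per token of the full text,
    aligned with the tokenization [t1, t2, t3, t4].
    prefix_len: number of conditioning tokens (2, for t1 and t2).
    """
    # log Pr(t3, t4 | t1, t2) = log Pr(t3 | t1, t2) + log Pr(t4 | t1, t2, t3)
    return sum(token_logprobs[prefix_len:])

# Made-up per-token log-probs for [t1, t2, t3, t4]:
lps = [-1.0, -0.5, -2.0, -0.25]
print(conditional_logprob(lps, prefix_len=2))  # -2.25
```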

In this conditional probability, I’m the one providing the hypothesized completion text `[t3, t4]`, not GPT-3. Setting the `logprobs` argument isn’t sufficient b/c (1) GPT-3 may not happen to sample `t3` or `t4`, or (2) the response may exclude `t3` or `t4` if they aren’t among the top `logprobs` tokens at their positions. That’s why the Completions endpoint doesn’t seem sufficient for my problem.

**Motivation**

I’d like to evaluate a simple approach to text classification tasks, specifically ones where the labels are textually meaningful. For example, in a sentiment classification task, an alternative to text completion is to estimate the probability of a fully written-out input text—

```
The sentiment of this tweet
"""
I loved the new Batman movie!
"""
is {sentiment}.
```

—where `{sentiment}` is replaced w/ `positive`, `negative`, or `neutral`, and then return the sentiment which gave the highest log-probability.
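The rule I have in mind is just an argmax over label scores. A minimal sketch, assuming some `score` function that returns log Pr(text) (here faked with made-up numbers, since getting real ones is exactly my question):

```python
def classify(template, labels, score):
    """Fill the template with each candidate label, score the completed
    text, and return the label whose text the model finds most probable."""
    return max(labels, key=lambda label: score(template.format(sentiment=label)))

# Made-up scores standing in for real log-probability estimates.
fake_scores = {
    "This tweet is positive.": -3.0,
    "This tweet is negative.": -7.5,
    "This tweet is neutral.": -5.2,
}
label = classify("This tweet is {sentiment}.",
                 ["positive", "negative", "neutral"],
                 fake_scores.get)
print(label)  # positive
```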

Has this method been evaluated anywhere? (IIUC, this approach is quite different from the deprecated Classification endpoint.)

I suppose OpenAI won’t release probabilities too liberally b/c a user could train on them and compete w/ GPT-3 or something. But I’m hoping there’s some way to support the above approach to classification, since it seems more straightforward than embeddings or completion.