Some applications of LLMs involve using them to score, not just generate, completions. For example, we may wish to generate several completions under one prompt, and then rank them according to how likely they are under some other prompt.
This is possible using the `logprobs` option in the completions endpoint, but it is not yet supported in the chat-completions endpoint, which is the only endpoint through which we can access the gpt-3.5 and gpt-4 model families.
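For concreteness, here is a minimal sketch of the scoring pattern with the completions endpoint: send the prompt plus an existing completion with `echo=True`, `max_tokens=0`, and `logprobs=0`, then sum the token logprobs that fall inside the completion. The helper name `completion_logprob` is ours, and the hand-built response below only illustrates the legacy completions response shape; a real response would come from `openai.Completion.create(...)`.

```python
import math

def completion_logprob(response, completion_start):
    """Sum the token logprobs of all tokens whose character offset is at or
    after `completion_start`, using the legacy completions response shape."""
    lp = response["choices"][0]["logprobs"]
    total = 0.0
    for offset, token_lp in zip(lp["text_offset"], lp["token_logprobs"]):
        # token_logprobs[0] is None (the first token has no context to score).
        if offset >= completion_start and token_lp is not None:
            total += token_lp
    return total

# Hand-built stand-in for a response from
# openai.Completion.create(..., echo=True, max_tokens=0, logprobs=0):
fake_response = {
    "choices": [{
        "logprobs": {
            "text_offset": [0, 4, 9],       # where each token starts
            "token_logprobs": [None, -1.0, -0.5],
        }
    }]
}

# Score only the tokens from character offset 4 onward (the "completion").
score = completion_logprob(fake_response, completion_start=4)
print(score)  # -1.5
```

Generating several completions and ranking them is then just sorting by this score under whichever prompt you care about.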
Does OpenAI plan to eventually expose the log sampling probabilities of gpt-3.5 and gpt-4 through the API? Or to continue improving the davinci-family models, which do support this feature? Or has the organization decided to move away from supporting these use cases?
Agreed! `logprobs` is super important for probabilistic inference. Most distributions libraries provide a pair of methods: `.sample()` to draw new samples and `.logprobs()` to evaluate existing samples. On top of these two methods one can build lots of probabilistic machinery, and we'd love to build that machinery.
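As a toy sketch of that pattern (all names here are illustrative, not part of any OpenAI API): a categorical distribution exposing exactly that method pair, used to rerank candidates drawn under one distribution by their log-probability under another, which mirrors the generate-with-one-prompt, score-with-another use case above.

```python
import math
import random

class Categorical:
    """Toy distribution exposing the sample()/log_prob() pair."""

    def __init__(self, probs):
        self.probs = probs  # dict mapping outcome -> probability

    def sample(self, rng=random):
        outcomes, weights = zip(*self.probs.items())
        return rng.choices(outcomes, weights=weights, k=1)[0]

    def log_prob(self, outcome):
        p = self.probs.get(outcome, 0.0)
        return math.log(p) if p > 0 else float("-inf")

# "Generation" distribution draws the candidates; a different
# "scoring" distribution ranks them by log-probability.
gen = Categorical({"yes": 0.5, "no": 0.4, "maybe": 0.1})
scorer = Categorical({"yes": 0.1, "no": 0.2, "maybe": 0.7})

candidates = {gen.sample() for _ in range(20)}
ranked = sorted(candidates, key=scorer.log_prob, reverse=True)
print(ranked)
```

With API-level logprobs, `scorer.log_prob` would be replaced by a call that scores each candidate completion under the second prompt.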