The logprobs are simply that: probabilities, represented logarithmically (base e).
I think what you wish for is the API returning the probabilities as they are shown in the playground, and also giving the total probability coverage of each token. We can do that.
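For example, taking the top token from the output further below (the logprob value is back-computed from that run, so treat it as illustrative):

import math
logprob = -0.549294  # reported log probability of the token " Hello"
print(math.exp(logprob))  # e**logprob -> ~0.5774, about a 57.7% chance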
Some imports:
import os
import openai
import json
import math
import copy
A function to accept a completion return and, if it has logprobs, add a new item with the same logprob structure, but with 0-1 probabilities instead of log values:
def add_probabilities(api_object):
    if not api_object['choices'][0]['logprobs']:
        return api_object
    # Create a deep copy of the api_object so we don't modify the original
    api_object_copy = copy.deepcopy(api_object)
    # Iterate over the choices
    for choice in api_object_copy['choices']:
        # Initialize the new "probabilities" dictionary
        probabilities = {}
        # Iterate over the "logprobs" dictionary
        for key, value in choice['logprobs'].items():
            # If the key is "tokens" or "text_offset", copy the value as is
            if key in ["tokens", "text_offset"]:
                probabilities[key] = value
            # If the key is "token_logprobs", convert the logprobs to probabilities
            elif key == "token_logprobs":
                probabilities[key] = [math.exp(logprob) for logprob in value]
            # If the key is "top_logprobs", iterate over the list of dictionaries and convert the logprobs to probabilities
            elif key == "top_logprobs":
                probabilities[key] = [{token: math.exp(logprob) for token, logprob in dict_item.items()} for dict_item in value]
                # Calculate total probabilities for each dictionary in "top_logprobs"
                probabilities["total_logprobs"] = [sum(dict_item.values()) for dict_item in probabilities["top_logprobs"]]
        # Add the "probabilities" dictionary to the choice
        choice['probabilities'] = probabilities
    return api_object_copy
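As a quick sanity check, you can feed it a minimal hand-built response; the numbers here are made up for illustration, not from a real API call:

fake_response = {
    "choices": [{
        "logprobs": {
            "tokens": [" Hello"],
            "token_logprobs": [-0.5493],
            "top_logprobs": [{" Hello": -0.5493, " Hi": -1.6595}],
            "text_offset": [22],
        }
    }]
}
checked = add_probabilities(fake_response)
print(checked["choices"][0]["probabilities"]["token_logprobs"])  # [~0.5774]
print(checked["choices"][0]["probabilities"]["total_logprobs"])  # [~0.7676]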
A little API function, where you could handle all the possible errors:
def do_api(api_prompt):
    api_params = {
        "model": "gpt-3.5-turbo-instruct",
        "max_tokens": 2,
        "top_p": 1e-9,
        "prompt": api_prompt,
        "logprobs": 5,  # required, or the response has no logprobs to convert
        "n": 1,
    }
    try:
        api_response = openai.Completion.create(**api_params)
        return api_response
    except Exception as err:
        print(f"API Error: {str(err)}")
        return None  # caller should check for None
And then our own little playground:
openai.api_key = os.getenv("OPENAI_API_KEY")
prompt="User: Say 'hello'.\nAI:"
print("##>", end="")
api_get = do_api(prompt)
api_out = add_probabilities(api_get)
api_string = json.dumps(api_out, indent=2)
#api_string = api_string.replace("\\n", "\n")
api_string = api_string.replace('\\"', '"')
print(api_string)
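If you'd rather have a compact playground-style readout than the full JSON, something like this works on the converted object (the display format is just my own choice):

for token, tops in zip(
        api_out["choices"][0]["probabilities"]["tokens"],
        api_out["choices"][0]["probabilities"]["top_logprobs"]):
    print(f"token: {token!r}")
    for candidate, prob in sorted(tops.items(), key=lambda kv: -kv[1]):
        print(f"  {candidate!r}: {prob * 100:.2f}%")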
OUTPUT: what we get for the new “probabilities” includes:
"top_logprobs": [
{
" Hello": 0.5773516400136589,
" Hi": 0.19021177923596874,
"Hello": 0.067644583100199,
"\n\n": 0.06604950779814563,
"Hi": 0.025466052201047033
},
{
"!": 0.5264271466063052,
".": 0.2591786724769263,
",": 0.09482852674076715,
" there": 0.04379592072803984,
"!\n": 0.024123977887792435
}
],
"text_offset": [
22,
28
],
"total_logprobs": [
0.9267235623490194,
0.9483542444398309
You can see that the values for the second token match the playground.
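As a check on the coverage numbers, summing the five second-token probabilities by hand reproduces the second "total_logprobs" entry:

print(sum([0.5264271466, 0.2591786725, 0.0948285267, 0.0437959207, 0.0241239779]))  # ~0.9483542444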