Logprob value is unbounded, i used sigmoid (converting API logarithm to probabilities)

The logprobs are simply that: probabilities, represented logarithmically (base e).
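Converting one back is just an exponentiation (the logprob value here is made up for illustration):

```python
import math

# A hypothetical base-e logprob returned by the API
logprob = -0.105

# exp() undoes the natural log, giving back a 0-1 probability (~0.90)
probability = math.exp(logprob)
```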

I think what you wish for is if the API returned the probabilities as they are shown in the playground, and then also gave the total probability coverage for each token. We can do that.

some imports

import os
import openai
import json
import math
import copy

A function that accepts a completion return and, if it has logprobs, adds a new item with the same logprob structure, but with 0-1 probabilities instead of log values.

def add_probabilities(api_object):
    # Return unchanged if the API call failed or logprobs weren't requested
    if not api_object or not api_object['choices'][0]['logprobs']:
        return api_object
    # Create a deep copy of the api_object so we don't modify the original
    api_object_copy = copy.deepcopy(api_object)
    
    # Iterate over the choices
    for choice in api_object_copy['choices']:
        # Initialize the new "probabilities" dictionary
        probabilities = {}
        
        # Iterate over the "logprobs" dictionary
        for key, value in choice['logprobs'].items():
            # If the key is "tokens" or "text_offset", copy the value as is
            if key in ["tokens", "text_offset"]:
                probabilities[key] = value
            # If the key is "token_logprobs", convert the logprobs to probabilities
            elif key == "token_logprobs":
                probabilities[key] = [math.exp(logprob) for logprob in value]
            # If the key is "top_logprobs", iterate over the list of dictionaries and convert the logprobs to probabilities
            elif key == "top_logprobs":
                probabilities[key] = [{token: math.exp(logprob) for token, logprob in dict_item.items()} for dict_item in value]
        
        # Calculate total probabilities for each dictionary in "top_logprobs"
        probabilities["total_logprobs"] = [sum(dict_item.values()) for dict_item in probabilities["top_logprobs"]]
        
        # Add the "probabilities" dictionary to the choice
        choice['probabilities'] = probabilities
    
    return api_object_copy
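For a single token position, the transformation the function applies boils down to this (the logprob values are made up for illustration):

```python
import math

# One hypothetical "top_logprobs" entry: token -> base-e logprob
top_logprobs = {" Hello": -0.549, " Hi": -1.659}

# exp() turns each logprob back into a 0-1 probability
probabilities = {token: math.exp(lp) for token, lp in top_logprobs.items()}

# summing them gives the total probability coverage of the top tokens
coverage = sum(probabilities.values())
```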

a little API function, where you could handle all the possible errors

def do_api(api_prompt):
    api_params = {
        "model": "gpt-3.5-turbo-instruct",
        "max_tokens": 2,
        "top_p": 1e-9,  # near-greedy sampling
        "prompt": api_prompt,
        "logprobs": 5,  # request the top-5 logprobs per token
        "n": 1,
    }
    try:
        api_response = openai.Completion.create(**api_params)
        return api_response
    except Exception as err:
        print(f"API Error: {str(err)}")

And then our own little playground

openai.api_key = os.getenv("OPENAI_API_KEY")
prompt="User: Say 'hello'.\nAI:"

print("##>", end="")
api_get = do_api(prompt)
api_out = add_probabilities(api_get)
api_string = json.dumps(api_out, indent=2)
#api_string = api_string.replace("\\n", "\n")
api_string = api_string.replace('\\"', '"')
print(api_string)

OUTPUT: what we get for the new “probabilities” key includes:

        "top_logprobs": [
          {
            " Hello": 0.5773516400136589,
            " Hi": 0.19021177923596874,
            "Hello": 0.067644583100199,
            "\n\n": 0.06604950779814563,
            "Hi": 0.025466052201047033
          },
          {
            "!": 0.5264271466063052,
            ".": 0.2591786724769263,
            ",": 0.09482852674076715,
            " there": 0.04379592072803984,
            "!\n": 0.024123977887792435
          }
        ],
        "text_offset": [
          22,
          28
        ],
        "total_logprobs": [
          0.9267235623490194,
          0.9483542444398309
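Subtracting a coverage value from 1 tells you how much probability mass fell outside the five returned tokens; for the first position above:

```python
# Top-5 coverage of the first token position from the output above
coverage = 0.9267235623490194

# the rest of the probability mass is spread over all other tokens
remaining = 1 - coverage
print(f"{remaining:.1%}")  # about 7.3% outside the top five
```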

You can see that the values for the second token match the playground:

(image: playground screenshot of the token probabilities)
