Bytes:\\xe2\\x80 - parsing response in logprobs

I am getting odd responses from the completions API that seem to involve text encodings and unusual characters. Here is an example that uses a "66 99" style (curly) quotation mark in the prompt to demonstrate:

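  import os
  import openai

  # Read the API key from the environment rather than hard-coding it.
  openai.api_key = os.getenv("OPENAI_API_KEY")

  resp = openai.Completion.create(
    model="text-davinci-003",
    prompt="Test of quote: ‟close",
    max_tokens=7,
    temperature=0,
    logprobs=5  # also return the 5 most likely tokens at each position
  )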


In [6]: resp['choices'][0]['logprobs']['top_logprobs'][3]
Out[6]:
<OpenAIObject at 0x103cd8180> JSON: {
  "\n": -4.0073056,
  "!": -3.8053296,
  "\"": -2.446624,
  ".": -2.2706704,
  "bytes:\\xe2\\x80": -0.2847361
}

How do I parse “bytes:\xe2\x80”?

At present I have only one note and one observation.


Note: U+E280 Private Use Character
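
Another reading, offered here as an assumption rather than something the API output confirms: \xe2\x80 may be two raw bytes instead of the single code point U+E280. In UTF-8, the quotation mark used in the prompt (U+201F) is a three-byte sequence that starts with exactly those two bytes, so the token would be an incomplete character. A quick check in plain Python, with no API call involved:

```python
# The quotation mark from the prompt, U+201F, encoded as UTF-8.
quote = "\u201f"
encoded = quote.encode("utf-8")
print(encoded)

# The first two bytes match the token in the logprobs output.
prefix = encoded[:2]
print(prefix)

# On their own, those two bytes are not a complete UTF-8 character.
try:
    prefix.decode("utf-8")
    prefix_is_complete = True
except UnicodeDecodeError:
    prefix_is_complete = False
print(prefix_is_complete)
```

If this reading is right, the token only becomes printable once the byte from the following token is appended.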


Observation:
These are JSON key-value pairs:

  "\n": -4.0073056,
  "!": -3.8053296,
  "\"": -2.446624,
  ".": -2.2706704,
  "bytes:\\xe2\\x80": -0.2847361

So the keys are "\n", "!", "\"", "." and "bytes:\\xe2\\x80". If you know how to translate the other keys to their meaning then the bytes should be in the same table.
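
One way to handle such keys is sketched below, under two assumptions: that the "bytes:" prefix is always followed only by \xNN escapes, and that consecutive tokens can be concatenated as bytes before decoding. The helper names token_to_bytes and tokens_to_text are made up for this example and are not part of the openai library, and the second token in the sample list ("bytes:\x9d") is hypothetical, since the post does not show which token actually follows.

```python
import re

BYTES_PREFIX = "bytes:"  # prefix seen in the logprobs keys above


def token_to_bytes(token: str) -> bytes:
    """Turn one logprobs token string into raw bytes."""
    if token.startswith(BYTES_PREFIX):
        # Collect the two hex digits from every literal \xNN escape.
        hex_pairs = re.findall(r"\\x([0-9a-fA-F]{2})", token[len(BYTES_PREFIX):])
        return bytes.fromhex("".join(hex_pairs))
    # Ordinary tokens are plain text.
    return token.encode("utf-8")


def tokens_to_text(tokens) -> str:
    """Join the tokens as bytes, then decode the whole run as UTF-8."""
    raw = b"".join(token_to_bytes(t) for t in tokens)
    # errors="replace" keeps a dangling partial character from raising.
    return raw.decode("utf-8", errors="replace")


# Hypothetical token sequence: the key from the post, then an assumed
# continuation byte, then an ordinary token.
sample = ["bytes:\\xe2\\x80", "bytes:\\x9d", "."]
print(token_to_bytes(sample[0]))
print(tokens_to_text(sample))
```

Decoding only after joining matters here, because a single "bytes:" token may hold just part of a character.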

