Top K Log Probabilities are not aligned with actual response token

Hi there,

I’m working on a prompt that relies on the log probabilities of certain tokens, specifically the integer tokens 1 to 5. The goal is to calculate a weighted score from the log probabilities of those integer tokens. I’m getting errors because the top-k log probabilities attached to the integer token are sometimes not the correct ones. Here’s an example of what we have been seeing. The following code:

# Print each token from position i to i + 3 next to its top-k alternatives.
for entry in response.choices[0].logprobs.content[i : i + 4]:
    print(entry.token.upper(), entry.top_logprobs)

prints the following:

_VALUE [TopLogprob(token='_value', bytes=[95, 118, 97, 108, 117, 101], logprob=0.0), TopLogprob(token='!', bytes=[33], logprob=-100.0), TopLogprob(token='"', bytes=[34], logprob=-100.0), TopLogprob(token='#', bytes=[35], logprob=-100.0), TopLogprob(token='$', bytes=[36], logprob=-100.0), TopLogprob(token='%', bytes=[37], logprob=-100.0), TopLogprob(token='&', bytes=[38], logprob=-100.0), TopLogprob(token="'", bytes=[39], logprob=-100.0), TopLogprob(token='(', bytes=[40], logprob=-100.0), TopLogprob(token=')', bytes=[41], logprob=-100.0)]

": [TopLogprob(token='":', bytes=[34, 58], logprob=0.0), TopLogprob(token='"', bytes=[34], logprob=-21.25), TopLogprob(token='":\n\n', bytes=[34, 58, 10, 10], logprob=-22.375), TopLogprob(token='":\n', bytes=[34, 58, 10], logprob=-25.375), TopLogprob(token='"\n\n', bytes=[34, 10, 10], logprob=-28.3125), TopLogprob(token='":\r\n', bytes=[34, 58, 13, 10], logprob=-29.0625), TopLogprob(token='"\n', bytes=[34, 10], logprob=-31.375), TopLogprob(token='"\r\n', bytes=[34, 13, 10], logprob=-37.140625), TopLogprob(token='"\n\n\n', bytes=[34, 10, 10, 10], logprob=-38.5234375), TopLogprob(token='"\r\n\r\n', bytes=[34, 13, 10, 13, 10], logprob=-39.2548828125)]

5 [TopLogprob(token='":', bytes=[34, 58], logprob=0.0), TopLogprob(token='"', bytes=[34], logprob=-21.25), TopLogprob(token='":\n\n', bytes=[34, 58, 10, 10], logprob=-22.375), TopLogprob(token='":\n', bytes=[34, 58, 10], logprob=-25.375), TopLogprob(token='"\n\n', bytes=[34, 10, 10], logprob=-28.3125), TopLogprob(token='":\r\n', bytes=[34, 58, 13, 10], logprob=-29.0625), TopLogprob(token='"\n', bytes=[34, 10], logprob=-31.375), TopLogprob(token='"\r\n', bytes=[34, 13, 10], logprob=-37.140625), TopLogprob(token='"\n\n\n', bytes=[34, 10, 10, 10], logprob=-38.5234375), TopLogprob(token='"\r\n\r\n', bytes=[34, 13, 10, 13, 10], logprob=-39.2548828125)]

," [TopLogprob(token=',"', bytes=[44, 34], logprob=0.0), TopLogprob(token=',', bytes=[44], logprob=-20.25), TopLogprob(token=' ,"', bytes=[32, 44, 34], logprob=-22.25), TopLogprob(token='\t', bytes=[9], logprob=-27.375), TopLogprob(token=',\n', bytes=[44, 10], logprob=-28.4375), TopLogprob(token='\n\n', bytes=[10, 10], logprob=-28.4375), TopLogprob(token=' ', bytes=[32], logprob=-29.4375), TopLogprob(token=' ,', bytes=[32, 44], logprob=-30.0625), TopLogprob(token='\n', bytes=[10], logprob=-30.625), TopLogprob(token='0', bytes=[48], logprob=-30.6875)]

For some reason, the top-k log probabilities returned for the token '5' are identical to the ones returned for the preceding token '":'.
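
For reference, the weighted score described above could be sketched roughly as follows. This is only an illustration under assumptions: `Alt` is a hypothetical stand-in for the SDK's `TopLogprob` objects (only `token` and `logprob` are used), and `weighted_score` is a made-up helper name, not part of any library. It also shows why a misaligned list breaks the calculation: no rating token is found at all.

```python
import math
from dataclasses import dataclass


@dataclass
class Alt:
    # Hypothetical stand-in for a TopLogprob entry (token + logprob only).
    token: str
    logprob: float


def weighted_score(top_logprobs, labels=("1", "2", "3", "4", "5")):
    """Probability-weighted mean of the integer labels found in top_logprobs."""
    probs = {}
    for alt in top_logprobs:
        tok = alt.token.strip()
        if tok in labels:
            # Sum in case a label shows up both with and without whitespace.
            probs[tok] = probs.get(tok, 0.0) + math.exp(alt.logprob)
    total = sum(probs.values())
    if total == 0.0:
        # This is what happens with the misaligned list shown above.
        raise ValueError("no rating token in top_logprobs; list may be misaligned")
    # Renormalize over the labels that were actually listed.
    return sum(int(tok) * p for tok, p in probs.items()) / total


# Made-up probabilities: 70% "5", 20% "4", 10% "3" gives a score near 4.6.
example = [Alt("5", math.log(0.7)), Alt("4", math.log(0.2)), Alt("3", math.log(0.1))]
score = weighted_score(example)
```

Raising an error when no label is present is a deliberate choice here: silently returning 0 would hide exactly the misalignment this thread is about.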

Encountered the same issue. The logprobs seem to be completely off, while gpt-4o-2024-11-20 seems fine.

Here’s gpt-4o, with the instruction “Output is only a one-stanza creative poem, of any topic” and these sampling parameters:
temperature=1.3,
top_p=0.99

Selected Token Logprob Top Alternatives (token, logprob)
B -1.6004 In, -0.3504
B, -1.6004
Wh, -3.3504
Under, -3.8504
Am, -4.1004
Upon, -4.6004
A, -5.1004
On, -7.1004
Through, -7.3504
Moon, -7.6004
Stars, -8.1004
Within, -8.3504
Among, -8.3504
The, -8.6004
Lost, -8.9754
Beyond, -8.9754
Silent, -8.9754
Between, -9.2254
W, -9.3504
At, -9.3504
ene -9999.0000 In, -0.3504
B, -1.6004
Wh, -3.3504
Under, -3.8504
Am, -4.1004
Upon, -4.6004
A, -5.1004
On, -7.1004
Through, -7.3504
Moon, -7.6004
Stars, -8.1004
Within, -8.3504
Among, -8.3504
The, -8.6004
Lost, -8.9754
Beyond, -8.9754
Silent, -8.9754
Between, -9.2254
W, -9.3504
At, -9.3504
ath -9999.0000 In, -0.3504
B, -1.6004
Wh, -3.3504
Under, -3.8504
Am, -4.1004
Upon, -4.6004
A, -5.1004
On, -7.1004
Through, -7.3504
Moon, -7.6004
Stars, -8.1004
Within, -8.3504
Among, -8.3504
The, -8.6004
Lost, -8.9754
Beyond, -8.9754
Silent, -8.9754
Between, -9.2254
W, -9.3504
At, -9.3504
the -9999.0000 In, -0.3504
B, -1.6004
Wh, -3.3504
Under, -3.8504
Am, -4.1004
Upon, -4.6004
A, -5.1004
On, -7.1004
Through, -7.3504
Moon, -7.6004
Stars, -8.1004
Within, -8.3504
Among, -8.3504
The, -8.6004
Lost, -8.9754
Beyond, -8.9754
Silent, -8.9754
Between, -9.2254
W, -9.3504
At, -9.3504
quiet -9999.0000 In, -0.3504
B, -1.6004
Wh, -3.3504
Under, -3.8504
Am, -4.1004
Upon, -4.6004
A, -5.1004
On, -7.1004
Through, -7.3504
Moon, -7.6004
Stars, -8.1004
Within, -8.3504
Among, -8.3504
The, -8.6004
Lost, -8.9754
Beyond, -8.9754
Silent, -8.9754
Between, -9.2254
W, -9.3504
At, -9.3504
whispers -3.3943 moon, -0.3943
,, -3.1443
whisper, -3.1443
canopy, -3.3943
whispers, -3.3943
stars, -3.8943
veil, -4.1443
night's, -4.1443
glow, -4.3943
night, -4.6443
silver, -4.6443
sky, -4.8943
twilight, -4.8943
of, -5.1443
st, -5.1443
midnight, -5.1443
e, -5.3943
autumn, -5.3943
willow, -5.3943
b, -5.7693
of -0.0009 of, -0.0009
,, -7.1259
in, -10.0009
where, -10.3759
night, -11.8759
on, -12.7509
spun, -13.0009
', -13.3759
that, -13.6259
the, -14.0009
trees, -14.0009
soft, -14.1884
moon, -14.3134
and, -14.3759
', -15.1259
dusk, -15.1259
low, -15.1884
sung, -15.2509
o, -15.3134
from, -15.3134
the -0.0254 the, -0.0254
moon, -4.9004
twilight, -5.1504
a, -5.4004
ancient, -5.9004
autumn, -7.1504
night, -7.5254
st, -7.6504
silver, -7.6504
an, -7.7754
midnight, -7.7754
old, -8.0254
night's, -8.5254
dusk, -8.9004
dawn, -9.1504
evening, -9.4004
falling, -9.5254
star, -9.7754
trees, -9.7754
stars, -10.1504
moon -0.5417 the, -0.0254
moon, -4.9004
twilight, -5.1504
a, -5.4004
ancient, -5.9004
autumn, -7.1504
night, -7.5254
st, -7.6504
silver, -7.6504
an, -7.7754
midnight, -7.7754
old, -8.0254
night's, -8.5254
dusk, -8.9004
dawn, -9.1504
evening, -9.4004
falling, -9.5254
star, -9.7754
trees, -9.7754
stars, -10.1504
lit -1.6864 ,, -0.9364
's, -0.9364
lit, -1.6864
’s, -3.6864
light, -5.3114
-k, -8.0614
-lit, -9.4364
so, -9.5614
beam, -9.5614
less, -9.8114
-t, -10.3114
rise, -11.0614
shine, -11.3114
,\n, -11.6864
,\\, -11.6864
-gl, -12.4364
ag, -12.6864
\n, -12.6864
-so, -12.8114
,\\\n, -12.8114
night -2.8211 trees, -1.0711
tide, -1.3211
sea, -2.0711
night, -2.8211
sky, -2.9461
stream, -3.4461
leaves, -3.6961
grove, -3.6961
breeze, -4.1961
gl, -4.4461
pine, -4.9461
lake, -5.0711
skies, -5.4461
tree, -5.8211
glow, -6.0711
seas, -6.0711
streams, -6.1961
tides, -6.1961
p, -6.5711
waves, -6.5711
, -0.0002 ,, -0.0002
,\\\n, -9.2502
’s, -10.5002
sky, -10.5002
,<, -12.8752
,\\, -13.7502
skies, -14.0002
's, -14.1252
\n, -14.1252
,\n, -14.2502
so, -14.6252
,*, -15.0002
,/, -15.1252
., -15.7502
fair, -16.6252
vast, -16.7502
divine, -17.0002
glow, -17.1252
, -17.2502
and, -17.6252
\n -0.0004 ,, -0.0002
,\\\n, -9.2502
’s, -10.5002
sky, -10.5002
,<, -12.8752
,\\, -13.7502
skies, -14.0002
's, -14.1252
\n, -14.1252
,\n, -14.2502
so, -14.6252
,*, -15.0002
,/, -15.1252
., -15.7502
fair, -16.6252
vast, -16.7502
divine, -17.0002
glow, -17.1252
, -17.2502
and, -17.6252
Dream -9999.0000 ,, -0.0002
,\\\n, -9.2502
’s, -10.5002
sky, -10.5002
,<, -12.8752
,\\, -13.7502
skies, -14.0002
's, -14.1252
\n, -14.1252
,\n, -14.2502
so, -14.6252
,*, -15.0002
,/, -15.1252
., -15.7502
fair, -16.6252
vast, -16.7502
divine, -17.0002
glow, -17.1252
, -17.2502
and, -17.6252
s -0.0006 s, -0.0006
ers, -8.0006
ing, -9.0006
sc, -9.8756
y, -11.6256
t, -11.8756
-l, -12.1256
-we, -12.5006
like, -12.6256
we, -13.0006
catch, -13.1256
we, -13.3756
's, -13.3756
-k, -13.3756
-t, -13.6256
er, -13.8756
-d, -13.8756
shadows, -14.0006
-filled, -14.0006
and, -14.1256
unf -2.8392 weave, -1.0892
dance, -1.2142
take, -2.2142
unf, -2.8392
w, -3.2142
unfold, -3.5892
drift, -4.3392
tip, -4.4642
sail, -4.4642
flutter, -4.5892
cascade, -4.7142
gather, -4.8392
like, -4.9642
unravel, -5.0892
wander, -5.3392
are, -5.4642
pir, -5.7142
woven, -5.7142
float, -5.8392
tw, -5.9642
url -0.0002 url, -0.0002
ur, -8.5002
etter, -14.0002
urls, -14.0002
ath, -16.6252
url, -17.8752
URL, -18.3752
azed, -18.5002
ollow, -18.8752
Url, -19.1252
urred, -19.5002
ound, -19.8752
older, -20.3752
elt, -20.6252
orld, -20.8752
alter, -20.8752
ail, -21.0002
ledged, -21.1252
.url, -21.3752
_url, -21.5002
like -0.2677 like, -0.2677
their, -1.8927
on, -2.7677
in, -4.3927
with, -5.3302
as, -6.3927
,, -6.9552
softly, -7.5177
gently, -8.3927
and, -8.4552
upon, -8.5802
where, -8.6427
ing, -9.3927
wings, -9.9552
soft, -10.4552
gentle, -11.5177
from, -11.6427
silver, -12.0802
across, -12.2052
within, -12.3302
petals -9999.0000 like, -0.2677
their, -1.8927
on, -2.7677
in, -4.3927
with, -5.3302
as, -6.3927
,, -6.9552
softly, -7.5177
gently, -8.3927
and, -8.4552
upon, -8.5802
where, -8.6427
ing, -9.3927
wings, -9.9552
soft, -10.4552
gentle, -11.5177
from, -11.6427
silver, -12.0802
across, -12.2052
within, -12.3302
in -0.5631 like, -0.2677
their, -1.8927
on, -2.7677
in, -4.3927
with, -5.3302
as, -6.3927
,, -6.9552
softly, -7.5177
gently, -8.3927
and, -8.4552
upon, -8.5802
where, -8.6427
ing, -9.3927
wings, -9.9552
soft, -10.4552
gentle, -11.5177
from, -11.6427
silver, -12.0802
across, -12.2052
within, -12.3302
the -0.8229 the, -0.8229
soft, -1.5729
a, -2.1979
gentle, -2.4479
silver, -2.6979
sil, -2.9479
their, -4.8229
delicate, -5.0729
celestial, -5.9479
sl, -6.1979
eth, -6.1979
tender, -6.8229
serene, -6.8229
twilight, -6.8229
secret, -7.0729
pale, -7.0729
dawn, -7.0729
an, -7.4479
silent, -7.4479
luminous, -7.4479
silver -1.3171 gentle, -1.0671
silver, -1.3171
soft, -1.5671
sil, -2.8171
cool, -3.8171
tender, -3.8171
dew, -4.6921
velvet, -5.0671
still, -5.1921
star, -5.5671
garden, -5.6921
hush, -5.6921
pale, -5.8171
softened, -5.8171
dark, -6.1921
tranquil, -6.1921
dim, -6.3171
breeze, -6.5671
st, -6.6921
glow, -6.6921
light -0.1399 light, -0.1399
ed, -2.1399
glow, -4.8899
light, -6.5149
's, -7.2649
dew, -7.7649
gle, -7.8899
-light, -8.1399
breeze, -8.5149
'd, -8.6399
sheen, -8.6399
n, -8.7649
ing, -8.7649
-t, -8.8899
twilight, -9.3899
flight, -9.5149
hue, -9.5149
-h, -9.7649
-white, -9.8899
sea, -10.0149
. -2.2616 ,, -0.1366
., -2.2616
;, -3.7616
, -8.5116
's, -12.2616
\n, -13.2616
:, -14.3866
’s, -14.8866
,\n, -16.0116
, -16.0116
—a, -16.1366
\\xcd, -16.2616
,—, -16.2616
––, -17.1991
.\n, -17.3241
so, -17.4491
wide, -17.9491
.,, -18.3241
, -18.3241
,*, -18.3866
\n -0.0002 ,, -0.1366
., -2.2616
;, -3.7616
, -8.5116
's, -12.2616
\n, -13.2616
:, -14.3866
’s, -14.8866
,\n, -16.0116
, -16.0116
—a, -16.1366
\\xcd, -16.2616
,—, -16.2616
––, -17.1991
.\n, -17.3241
so, -17.4491
wide, -17.9491
.,, -18.3241
, -18.3241
,*, -18.3866
Stars -9999.0000 ,, -0.1366
., -2.2616
;, -3.7616
, -8.5116
's, -12.2616
\n, -13.2616
:, -14.3866
’s, -14.8866
,\n, -16.0116
, -16.0116
—a, -16.1366
\\xcd, -16.2616
,—, -16.2616
––, -17.1991
.\n, -17.3241
so, -17.4491
wide, -17.9491
.,, -18.3241
, -18.3241
,*, -18.3866
, -4.1830 ,, -0.1366
., -2.2616
;, -3.7616
, -8.5116
's, -12.2616
\n, -13.2616
:, -14.3866
’s, -14.8866
,\n, -16.0116
, -16.0116
—a, -16.1366
\\xcd, -16.2616
,—, -16.2616
––, -17.1991
.\n, -17.3241
so, -17.4491
wide, -17.9491
.,, -18.3241
, -18.3241
,*, -18.3866
like -9999.0000 ,, -0.1366
., -2.2616
;, -3.7616
, -8.5116
's, -12.2616
\n, -13.2616
:, -14.3866
’s, -14.8866
,\n, -16.0116
, -16.0116
—a, -16.1366
\\xcd, -16.2616
,—, -16.2616
––, -17.1991
.\n, -17.3241
so, -17.4491
wide, -17.9491
.,, -18.3241
, -18.3241
,*, -18.3866
ancient -9999.0000 ,, -0.1366
., -2.2616
;, -3.7616
, -8.5116
's, -12.2616
\n, -13.2616
:, -14.3866
’s, -14.8866
,\n, -16.0116
, -16.0116
—a, -16.1366
\\xcd, -16.2616
,—, -16.2616
––, -17.1991
.\n, -17.3241
so, -17.4491
wide, -17.9491
.,, -18.3241
, -18.3241
,*, -18.3866
poets -4.4052 secrets, -1.2802
stories, -2.0302
guardians, -2.0302
storyt, -2.2802
tales, -2.4052
eyes, -3.1552
echoes, -3.1552
keep, -3.9052
lantern, -4.0302
sent, -4.1552
poets, -4.4052
wander, -4.6552
run, -4.9052
guides, -4.9052
watchers, -4.9052
souls, -5.0302
sages, -5.0302
scrib, -5.1552
watch, -5.2802
travelers, -5.2802

The major fault: high-certainty tokens, such as the rest of a word, come back as 0.0000 on gpt-4o-mini (which is also bad), but here -9999 is written for that 100% probability.
-9999 also appears for sampled tokens that have no entry in the top list, where normally the top-20 list would be extended to 21 entries if an unlikely token had been sampled.

It appears that, for a run of 100% tokens completing a word, the top list returned at each position is just the first token's list, repeated.

Near the bottom, the sampled comma has a logprob of -4.1830, but the comma in its top list is -0.1366.
Likewise, “in” is sampled at -0.5631 but listed at -4.3927 in its top_logprobs.
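
Both symptoms (a repeated top list, and a sampled logprob that disagrees with its own top-list entry) can be checked for mechanically. The following is a rough sketch, not an official check: `Alt`, `Position`, and `find_misaligned` are hypothetical names that only mirror the `token`, `logprob`, and `top_logprobs` fields seen in the response above, and the sample values are copied from the table.

```python
from dataclasses import dataclass, field


@dataclass
class Alt:
    # Hypothetical stand-in for one top_logprobs entry.
    token: str
    logprob: float


@dataclass
class Position:
    # Hypothetical stand-in for one element of logprobs.content.
    token: str
    logprob: float
    top_logprobs: list = field(default_factory=list)


def find_misaligned(content, tol=1e-3):
    """Return (index, token, reason) tuples for suspicious positions."""
    problems = []
    prev_top = None
    for i, pos in enumerate(content):
        top = [(a.token, a.logprob) for a in pos.top_logprobs]
        # Symptom 1: the whole top list is a copy of the previous position's.
        if top and top == prev_top:
            problems.append((i, pos.token, "top list repeats previous position"))
        # Symptom 2: the sampled token is listed, but with a different logprob.
        listed = next((lp for tok, lp in top if tok == pos.token), None)
        if listed is not None and abs(listed - pos.logprob) > tol:
            problems.append((i, pos.token, "logprob differs from top-list entry"))
        prev_top = top
    return problems


# Values taken from the "like / petals / in / the" rows of the table above,
# truncated to the first few alternatives.
like_top = [Alt("like", -0.2677), Alt("their", -1.8927),
            Alt("on", -2.7677), Alt("in", -4.3927)]
content = [
    Position("like", -0.2677, like_top),
    Position("petals", -9999.0, like_top),
    Position("in", -0.5631, like_top),
    Position("the", -0.8229, [Alt("the", -0.8229), Alt("soft", -1.5729)]),
]
problems = find_misaligned(content)
```

On this sample, "petals" is flagged once (repeated list) and "in" twice (repeated list and mismatched logprob), while "like" and "the" pass.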

OpenAI uses a different softmax or a filtered dictionary to give YOU a different, false result: for example, one without the special tokens above 200000 that end the output. If the chance of a stop sequence was 40%, what you receive is 100% spread over the smaller set. Perhaps these errors expose the difference between reality and the fiction that logprobs report.
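
To make the 40% example concrete, here is a small sketch of what renormalizing over a filtered token set would do. It only illustrates the hypothesis in the previous paragraph and says nothing confirmed about how the API actually computes logprobs; the `<stop>` token name, the probabilities, and the `renormalized_logprobs` helper are all invented for the illustration.

```python
import math


def renormalized_logprobs(logprobs, hidden):
    """Drop the hidden tokens and rescale the rest so they sum to 1."""
    kept = {tok: lp for tok, lp in logprobs.items() if tok not in hidden}
    # Log of the probability mass that remains after filtering.
    log_z = math.log(sum(math.exp(lp) for lp in kept.values()))
    return {tok: lp - log_z for tok, lp in kept.items()}


# Invented distribution: 40% stop token, 45% ".", 15% ",".
full = {
    "<stop>": math.log(0.40),
    ".": math.log(0.45),
    ",": math.log(0.15),
}
visible = renormalized_logprobs(full, hidden={"<stop>"})
visible_probs = {tok: math.exp(lp) for tok, lp in visible.items()}
```

With the stop token hidden, "." would be reported at about 75% and "," at about 25%, even though they were sampled with 45% and 15%.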

Multi-faceted busted logprobs.

that has also seemed to fix the issue