I'm seeing a difference in the log probabilities for the same tokens in the response when echo is set to True vs False.
Notice in the example below that the log probability for the token " John" is -4.1053495 when echo=True and -4.104846 when echo=False.
Is this expected behavior? If so, why would we expect different log probabilities for the individual tokens for exactly the same prompt?
When echo is True:
Request:
{'engine': 'davinci', 'prompt': 'Hello, my name is', 'temperature': 0, 'n': 1, 'max_tokens': 10, 'best_of': 1, 'logprobs': 1, 'stop': None, 'top_p': 1, 'presence_penalty': 0, 'frequency_penalty': 0, 'echo': True}
Response:
{
"choices": [
{
"finish_reason": "length",
"index": 0,
"logprobs": {
"text_offset": [
0,
5,
6,
9,
14,
17,
22,
23,
25,
28,
30,
41,
51,
52,
54
],
"token_logprobs": [
null,
-1.7002722,
-3.0985389,
-0.33346418,
-0.05650585,
-4.1053495,
-1.9224068,
-0.6926424,
-1.4471763,
-1.0555193,
-3.2214253,
-1.8482708,
-1.0643281,
-1.137746,
-1.5790797
],
"tokens": [
"Hello",
",",
" my",
" name",
" is",
" John",
".",
" I",
" am",
" a",
" recovering",
" alcoholic",
".",
" I",
" have"
],
"top_logprobs": [
null,
{
",": -1.7002722
},
{
" I": -2.4386003
},
{
" name": -0.33346418
},
{
" is": -0.05650585
},
{
" John": -4.1053495
},
{
".": -1.9224068
},
{
" I": -0.6926424
},
{
" am": -1.4471763
},
{
" a": -1.0555193
},
{
" recovering": -3.2214253
},
{
" alcoholic": -1.8482708
},
{
".": -1.0643281
},
{
" I": -1.137746
},
{
" have": -1.5790797
}
]
},
"text": "Hello, my name is John. I am a recovering alcoholic. I have"
}
],
"created": 1642114600,
"id": "cmpl-4Q3JIcJJLL8nVB0aJwc95qkkIO41x",
"model": "davinci:2020-05-03",
"object": "text_completion",
"request_time": 1.2324142456054688
}
When echo is False:
Request:
{'engine': 'davinci', 'prompt': 'Hello, my name is', 'temperature': 0, 'n': 1, 'max_tokens': 10, 'best_of': 1, 'logprobs': 1, 'stop': None, 'top_p': 1, 'presence_penalty': 0, 'frequency_penalty': 0, 'echo': False}
Response:
{
"choices": [
{
"finish_reason": "length",
"index": 0,
"logprobs": {
"text_offset": [
17,
22,
23,
25,
28,
30,
41,
51,
52,
54
],
"token_logprobs": [
-4.104846,
-1.9145488,
-0.69057566,
-1.4444675,
-1.0576655,
-3.235984,
-1.862587,
-1.0654857,
-1.1328539,
-1.5772507
],
"tokens": [
" John",
".",
" I",
" am",
" a",
" recovering",
" alcoholic",
".",
" I",
" have"
],
"top_logprobs": [
{
" John": -4.104846
},
{
".": -1.9145488
},
{
" I": -0.69057566
},
{
" am": -1.4444675
},
{
" a": -1.0576655
},
{
" recovering": -3.235984
},
{
" alcoholic": -1.862587
},
{
".": -1.0654857
},
{
" I": -1.1328539
},
{
" have": -1.5772507
}
]
},
"text": " John. I am a recovering alcoholic. I have"
}
],
"created": 1642114681,
"id": "cmpl-4Q3KbiQV1Jy1Vj7ikyRysoqYwjVPG",
"model": "davinci:2020-05-03",
"object": "text_completion",
"request_time": 0.9291911125183105
}
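For what it's worth, aligning the ten continuation tokens from the two responses above shows the discrepancies are all small, on the order of 1e-2 or less. A quick sketch to compute them (the logprob values are copied directly from the responses; nothing else is assumed):

```python
# Continuation tokens shared by both responses.
tokens = [" John", ".", " I", " am", " a", " recovering",
          " alcoholic", ".", " I", " have"]

# echo=True: token_logprobs minus the leading null and the 5 prompt tokens.
echo_true = [-4.1053495, -1.9224068, -0.6926424, -1.4471763, -1.0555193,
             -3.2214253, -1.8482708, -1.0643281, -1.137746, -1.5790797]

# echo=False: every entry is a continuation token.
echo_false = [-4.104846, -1.9145488, -0.69057566, -1.4444675, -1.0576655,
              -3.235984, -1.862587, -1.0654857, -1.1328539, -1.5772507]

# Absolute per-token difference between the two runs.
diffs = [abs(t - f) for t, f in zip(echo_true, echo_false)]
for tok, d in zip(tokens, diffs):
    print(f"{tok!r}: {d:.7f}")

print("max difference:", max(diffs))
```

Every token shows a nonzero but tiny discrepancy, not just " John", which suggests this is a systematic numerical effect across the whole completion rather than something specific to one token.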