I’m seeing slightly different log probabilities for the same completion tokens depending on whether echo is set to True or False.
Notice in the example below that the log probability of the token " John" is -4.1053495 when echo=True but -4.104846 when echo=False.
Is this expected behavior? If so, why would the per-token log probabilities differ for exactly the same prompt and sampling parameters?
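For reference, here is a minimal script that reproduces both calls, assuming the pre-1.0 openai Python client (openai.Completion.create); only echo changes between the two requests.

import openai  # legacy (pre-1.0) client, matching the request dicts below

openai.api_key = "YOUR_API_KEY"  # placeholder

params = dict(
    engine="davinci",
    prompt="Hello, my name is",
    temperature=0,
    max_tokens=10,
    logprobs=1,
)

for echo in (True, False):
    resp = openai.Completion.create(echo=echo, **params)
    lp = resp["choices"][0]["logprobs"]
    print(f"echo={echo}")
    # With echo=True the first entry of token_logprobs is null/None,
    # since there is no preceding context for the first prompt token.
    for tok, logp in zip(lp["tokens"], lp["token_logprobs"]):
        print(f"  {tok!r}: {logp}")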
When echo is True:
Request:
{'engine': 'davinci', 'prompt': 'Hello, my name is', 'temperature': 0, 'n': 1, 'max_tokens': 10, 'best_of': 1, 'logprobs': 1, 'stop': None, 'top_p': 1, 'presence_penalty': 0, 'frequency_penalty': 0, 'echo': True}
Response:
{
"choices": [
{
"finish_reason": "length",
"index": 0,
"logprobs": {
"text_offset": [
0,
5,
6,
9,
14,
17,
22,
23,
25,
28,
30,
41,
51,
52,
54
],
"token_logprobs": [
null,
-1.7002722,
-3.0985389,
-0.33346418,
-0.05650585,
-4.1053495,
-1.9224068,
-0.6926424,
-1.4471763,
-1.0555193,
-3.2214253,
-1.8482708,
-1.0643281,
-1.137746,
-1.5790797
],
"tokens": [
"Hello",
",",
" my",
" name",
" is",
" John",
".",
" I",
" am",
" a",
" recovering",
" alcoholic",
".",
" I",
" have"
],
"top_logprobs": [
null,
{
",": -1.7002722
},
{
" I": -2.4386003
},
{
" name": -0.33346418
},
{
" is": -0.05650585
},
{
" John": -4.1053495
},
{
".": -1.9224068
},
{
" I": -0.6926424
},
{
" am": -1.4471763
},
{
" a": -1.0555193
},
{
" recovering": -3.2214253
},
{
" alcoholic": -1.8482708
},
{
".": -1.0643281
},
{
" I": -1.137746
},
{
" have": -1.5790797
}
]
},
"text": "Hello, my name is John. I am a recovering alcoholic. I have"
}
],
"created": 1642114600,
"id": "cmpl-4Q3JIcJJLL8nVB0aJwc95qkkIO41x",
"model": "davinci:2020-05-03",
"object": "text_completion",
"request_time": 1.2324142456054688
}
When echo is False:
Request:
{'engine': 'davinci', 'prompt': 'Hello, my name is', 'temperature': 0, 'n': 1, 'max_tokens': 10, 'best_of': 1, 'logprobs': 1, 'stop': None, 'top_p': 1, 'presence_penalty': 0, 'frequency_penalty': 0, 'echo': False}
Response:
{
"choices": [
{
"finish_reason": "length",
"index": 0,
"logprobs": {
"text_offset": [
17,
22,
23,
25,
28,
30,
41,
51,
52,
54
],
"token_logprobs": [
-4.104846,
-1.9145488,
-0.69057566,
-1.4444675,
-1.0576655,
-3.235984,
-1.862587,
-1.0654857,
-1.1328539,
-1.5772507
],
"tokens": [
" John",
".",
" I",
" am",
" a",
" recovering",
" alcoholic",
".",
" I",
" have"
],
"top_logprobs": [
{
" John": -4.104846
},
{
".": -1.9145488
},
{
" I": -0.69057566
},
{
" am": -1.4444675
},
{
" a": -1.0576655
},
{
" recovering": -3.235984
},
{
" alcoholic": -1.862587
},
{
".": -1.0654857
},
{
" I": -1.1328539
},
{
" have": -1.5772507
}
]
},
"text": " John. I am a recovering alcoholic. I have"
}
],
"created": 1642114681,
"id": "cmpl-4Q3KbiQV1Jy1Vj7ikyRysoqYwjVPG",
"model": "davinci:2020-05-03",
"object": "text_completion",
"request_time": 0.9291911125183105
}
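For what it's worth, the gap isn't limited to " John": every overlapping completion token shifts by a small amount. A quick comparison of the two token_logprobs lists above (values copied verbatim from the responses; variable names are just for illustration):

tokens = [" John", ".", " I", " am", " a", " recovering",
          " alcoholic", ".", " I", " have"]
echo_true = [-4.1053495, -1.9224068, -0.6926424, -1.4471763, -1.0555193,
             -3.2214253, -1.8482708, -1.0643281, -1.137746, -1.5790797]
echo_false = [-4.104846, -1.9145488, -0.69057566, -1.4444675, -1.0576655,
              -3.235984, -1.862587, -1.0654857, -1.1328539, -1.5772507]

for tok, a, b in zip(tokens, echo_true, echo_false):
    print(f"{tok!r:>14}  echo=True {a:.7f}  echo=False {b:.7f}  diff {a - b:+.7f}")

Every difference is small (between roughly 5e-4 and 1.5e-2 in absolute value), but the two calls are clearly not returning bit-identical numbers.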