Hi,
I am using the completions API to build a text summarizer that summarizes the text on the current web page.
Request:
{
  "model": "text-davinci-003",
  "prompt": "Some long paragraph.\n\nTLDR; Bot\n\nMe\n\nsummarize in 20 words\n\n6:35:30 pm\n\nSend\n$Request failed with status code 400 \n\n summarize",
  "echo": false,
  "max_tokens": 1000,
  "n": 1,
  "temperature": 1,
  "user": "user@gmail.com"
}
Error:
{
  "error": {
    "message": "This model's maximum context length is 4097 tokens, however you requested 4121 tokens (3121 in your prompt; 1000 for the completion). Please reduce your prompt; or completion length.",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}
Please help me understand how I can correct this. Also, can I maintain a conversation session for doing Q&A on this text?
The error message explains it. You are using the text-davinci-003 model, which has a 4097-token context limit (still more than the other models). Your prompt contains 3121 tokens, and you set max_tokens (the maximum number of tokens in the response) to 1000. Together that is 3121 + 1000 = 4121 tokens, which exceeds the 4097-token limit that applies to the combined input and output of this endpoint.
Simply reduce max_tokens from 1000 to something smaller (say, 500); since your prompt asks for a 20-word summary, you don't need 1000 tokens anyway. Then send your request.
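As a rough sketch (using the legacy openai Python package and its Completion endpoint; the variable names here, such as page_text, are illustrative, not part of your code):

import openai

openai.api_key = "YOUR_API_KEY"  # assumption: key set inline; prefer an env var in practice

page_text = "Some long paragraph."  # the web-page text you want summarized

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=page_text + "\n\nSummarize the above in 20 words:",
    max_tokens=500,   # was 1000; 500 leaves room within the 4097-token context
    n=1,
    temperature=1,
    echo=False,
)

print(response["choices"][0]["text"])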
Thanks @kian.aghaei
Also, can I maintain a conversation session for doing Q&A on this text?
I am not sure what you mean by maintaining the conversation. Regarding the example above, though, you can always feed the result of the previous call back into the next call to the completions endpoint, but the 4097-token limit still applies. So if you want to have a conversation and it grows beyond that limit, you have to drop entries from the top of the queue (the oldest ones) to make room.
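One possible sketch of that trimming logic (assuming the tiktoken package for token counting; the function and constant names here are made up for illustration):

import tiktoken

enc = tiktoken.encoding_for_model("text-davinci-003")

CONTEXT_LIMIT = 4097
MAX_COMPLETION = 500  # tokens reserved for the model's answer

def build_prompt(history, new_question):
    # history is a list of strings: earlier questions and answers, oldest first
    history = history + [new_question]
    prompt = "\n\n".join(history)
    # Drop the oldest entries until prompt + completion fits in the context window
    while len(enc.encode(prompt)) + MAX_COMPLETION > CONTEXT_LIMIT and len(history) > 1:
        history = history[1:]  # drop from the top of the queue
        prompt = "\n\n".join(history)
    return prompt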
@kian.aghaei Is there any way to summarize text longer than the 4097-token maximum?