How can I tell when the token context length is exceeded because of the response? If the input alone exceeds it, OpenAI responds with the error code context_length_exceeded. But it seems that if the limit is exceeded while generating a response, the API just truncates the response and does not throw an error, which causes a lot of downstream problems.
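For reference, here’s a minimal sketch of how the input-side failure shows up (this assumes the openai Python SDK v1.x; the exception class and attribute names may differ in other versions):

import openai
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A prompt deliberately far beyond the model's context window.
very_long_prompt = "word " * 100_000

try:
    client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": very_long_prompt}],
    )
except openai.BadRequestError as e:
    # The request is rejected before any generation happens; the error body
    # carries the code "context_length_exceeded".
    if getattr(e, "code", None) == "context_length_exceeded":
        print("Prompt alone is too long for the model's context window.")
    else:
        raise

The problem is that no equivalent exception is raised when the limit is hit mid-generation.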
You could add a check in your own code that raises an error if the token length is exceeded.
Welcome to the OpenAI community @ball
Here’s what the chat completion response looks like:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-3.5-turbo-0613",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
If the response was cut off because the context length was exceeded during generation, the finish_reason property will have the value length. If everything goes right, the value will be stop.
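As a minimal sketch (assuming the openai Python SDK v1.x; field names may differ slightly in other clients), you can check the value on the first choice of the response:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a long essay about tokenization."}],
    max_tokens=50,  # small cap so a truncated ("length") response is easy to reproduce
)

choice = response.choices[0]
if choice.finish_reason == "length":
    # Generation hit a token limit (max_tokens or the context window),
    # so the content is likely cut off mid-thought.
    raise RuntimeError("Completion was truncated; shorten the prompt or raise max_tokens.")

print(choice.message.content)

The same check also works when streaming, since the final chunk carries the finish_reason.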
This is the correct answer: "If the response was cut off because the context length was exceeded during generation, the finish_reason property will have the value length."
This is helpful. Thank you. Now I need to figure out how finish_reason gets exposed in LangChain. It seems like there is some information at github.com/langchain-ai/langchainjs/issues/2099, but I haven’t been able to get to the right data yet. Anyway, thank you for your help.