OpenAI API error message vs. website documentation list different max context limits

I get the error below when I exceed the context limit of the OpenAI API endpoint using gpt-3.5-turbo; it says the maximum prompt (“context”) length is 4097 tokens.

However, the documentation says that the context limit is 4096 tokens.

I’ve encountered the same discrepancy for gpt-3.5-turbo-0301 and gpt-3.5-turbo-0613 as well.

My question is, which one – 4096 or 4097 – should I use as a constant in my code?

I’d prefer to use the number from the API error message, but I know that whenever the documentation changes, I’ll overwrite my constants with whatever values I see there – hence the question.

The API error message seems to be the more accurate of the two, because I can send exactly 4097 prompt tokens and then receive a response (“completion”) that’s 1 token long, for a total token count of 4098:

(Can’t include second screenshot since new users are limited to one per post)
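To make the arithmetic concrete, here’s a minimal sketch of the budgeting involved. The constant and the helper name are mine (not from any OpenAI SDK), and it assumes the limit covers prompt + completion combined:

```python
# Sketch: budgeting completion tokens against a model's context limit.
# CONTEXT_LIMIT is taken from the API error message (4097 for
# gpt-3.5-turbo); whether to trust it over the documented 4096 is
# exactly the question in this post.

CONTEXT_LIMIT = 4097  # per the API error message; the docs say 4096


def max_completion_tokens(prompt_tokens: int,
                          context_limit: int = CONTEXT_LIMIT) -> int:
    """Largest max_tokens value that keeps prompt + completion
    within the context limit (assuming the limit covers both)."""
    return max(0, context_limit - prompt_tokens)


print(max_completion_tokens(4000))  # 97 tokens left for the completion
print(max_completion_tokens(4097))  # 0
```

Note that under this assumption a 4097-token prompt leaves room for a 0-token completion, yet the experiment above got 1 token back (4097 + 1 = 4098 total), which is part of why the numbers are confusing.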

I’m thinking the website documentation has a typo, since for other models the max context token limit matches the error message I get when exceeding it. If that’s the case, is this forum the right avenue for reporting documentation bugs?

But if it’s not a typo, why would the numbers be different?

Follow-up screenshot of sending exactly 4097 prompt tokens, despite the website saying the limit is 4096: