How to understand new model limits? turbo

Hello.
gpt-3.5-turbo is limited to 4096 tokens, but how should I understand this limit?
Does it apply only to the last message, or to the whole context: the system message plus all user and assistant messages?

If it applies only to the latest message, how large can the context (chat history) be?

The limit covers all the tokens in the messages array (system, user, and assistant messages) plus the completion.

Token usage is reported in the response, so you can double-check and confirm.
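
As a rough sketch of that check: the `usage` object in a chat completion response has `prompt_tokens`, `completion_tokens`, and `total_tokens` fields. The numbers in `sample_usage` below are made up for illustration, and the helper names are hypothetical, not part of any library.

```python
# Limit stated in the question for gpt-3.5-turbo.
CONTEXT_LIMIT = 4096


def remaining_completion_budget(prompt_tokens: int, limit: int = CONTEXT_LIMIT) -> int:
    """Tokens left for the completion once the messages array is counted."""
    return max(limit - prompt_tokens, 0)


def fits_in_context(usage: dict, limit: int = CONTEXT_LIMIT) -> bool:
    """True if prompt (all messages) plus completion stays within the limit."""
    return usage["prompt_tokens"] + usage["completion_tokens"] <= limit


# Hypothetical usage block, shaped like the one returned in the response.
sample_usage = {"prompt_tokens": 3000, "completion_tokens": 500, "total_tokens": 3500}

print(fits_in_context(sample_usage))
print(remaining_completion_budget(sample_usage["prompt_tokens"]))
```

So the longer the chat history you send, the less room is left for the reply.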
