Token limit for gpt-3.5-turbo-0125

Hello, we want to use gpt-3.5-turbo-0125 in our application, but we are not sure about the maximum token limit (request + response) for this model. At https://platform.openai.com/docs/models/gpt-3-5-turbo I can see that the maximum output is 4k tokens and the context window is 16k. Does this mean we can use up to 20k tokens in total (16k input and 4k output), or does the 16k include the output too?

The 16k includes the 4k: the context window covers input and output together, so the total is 16k, not 20k.

For example, if your prompt already uses 15k tokens of context, the model can only generate an additional 1k.
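To make the arithmetic concrete, here is a minimal sketch of how the output budget could be computed. The constants are the rounded 16k and 4k figures from this thread, not exact limits, and `max_completion_tokens` is a hypothetical helper name, not part of any OpenAI library.

```python
# Rounded figures quoted in this thread (assumptions, not exact limits).
CONTEXT_WINDOW = 16_000  # shared by prompt and completion
MAX_OUTPUT = 4_000       # cap on completion tokens alone


def max_completion_tokens(prompt_tokens: int,
                          context_window: int = CONTEXT_WINDOW,
                          max_output: int = MAX_OUTPUT) -> int:
    """Return how many tokens the model could still generate.

    The completion is limited both by the output cap and by whatever
    room the prompt leaves in the shared context window.
    """
    remaining = context_window - prompt_tokens
    return max(0, min(max_output, remaining))


if __name__ == "__main__":
    for used in (10_000, 15_000, 17_000):
        print(used, max_completion_tokens(used))
```

With a 10k prompt the output cap is the binding limit, with a 15k prompt the context window is, and a prompt larger than the window leaves no room at all. Counting the prompt's tokens is a separate step that this sketch leaves out.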
