Maximum Context Length Error across different models

So, I am sending the same prompt:

With GPT-4 I get this: textPayload: "message: 'This model's maximum context length is 8192 tokens. However, you requested 8808 tokens (617 in the messages, 8191 in the completion). Please reduce the length of the messages or completion.'"

With gpt-3.5-turbo-16k I get: textPayload: "Error in chatGPTPrompt function: BadRequestError: 400 This model's maximum context length is 16385 tokens. However, you requested 17001 tokens (617 in the messages, 16384 in the completion). Please reduce the length of the messages or completion."

Why is this happening? Why is it trying to use the maximum number of tokens rather than the amount it needs? Or is this a bug? It was working perfectly a few days ago with gpt-4-turbo.

Hey! Can you give me an example of the prompt that is causing this?

Also, to be clear, every model has a different context window length. GPT-4 currently defaults to 8k, GPT-3.5 Turbo 16k has 16k, and GPT-4 Turbo has 128k. So the fact that it worked with GPT-4 Turbo does not mean it will work with the other models, since they have smaller context windows.
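To illustrate the arithmetic, here is a minimal sketch (assuming the tiktoken package is installed; the CONTEXT_WINDOW table and completion_budget helper are names I made up for the example, not part of any API) of how much room is left for the completion once the prompt is counted:

import tiktoken  # assumption: OpenAI's tiktoken package for token counting

# Approximate context windows for the models discussed in this thread.
CONTEXT_WINDOW = {
    "gpt-4": 8192,
    "gpt-3.5-turbo-16k": 16385,
    "gpt-4-turbo": 128000,
}

def completion_budget(model: str, prompt_text: str) -> int:
    """Estimate the tokens left for the completion after the prompt.

    This ignores the small per-message overhead the chat format adds,
    so treat the result as an estimate, not an exact figure.
    """
    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by these chat models
    prompt_tokens = len(enc.encode(prompt_text))
    return CONTEXT_WINDOW[model] - prompt_tokens

# Example: with ~617 prompt tokens (as in the errors above), requesting
# 8191 completion tokens on gpt-4 overflows: 617 + 8191 = 8808 > 8192.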

Hi @shevvydj

Are you passing max_tokens in the request?

As Logan mentioned, we’ll be able to help you better if you provide the code that reproduces the error.

You are using the max_tokens API parameter incorrectly.

You do not set it to the maximum context length of the AI model.

You set it to the maximum portion of the context window that you want reserved for the response; the prompt tokens plus max_tokens must still fit within the model's context length.
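As a rough sketch (assuming the openai Python client v1.x; the prompt text and the 500-token figure are placeholders), set max_tokens to the size of the reply you actually want, not to the model's full context length:

from openai import OpenAI  # assumption: openai Python library v1.x

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo-16k",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this article in three sentences: ..."},
    ],
    # Reserve only what the reply needs (here ~500 tokens), not the
    # full 16,384-token context window. Prompt tokens plus this value
    # must stay under the model's 16,385-token context length.
    max_tokens=500,
)

print(response.choices[0].message.content)

With a ~617-token prompt (as in the errors above) and max_tokens=500, the total stays well under the limits of every model mentioned in this thread.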