Max tokens of Azure OpenAI model

I printed the following:

print(AzureOpenAI(deployment_name=xx, temperature=0.2))
It shows max_tokens as 256, but when I get the response, I see more than 1000 tokens. How is that possible? Does the model ignore max_tokens?


Are you setting max_tokens? API Reference - OpenAI API

No, I am not. It is the default of 256 tokens.

No, the max_tokens default is infinite. The following is from the documentation:

max_tokens integer Optional Defaults to inf

The maximum number of tokens to generate in the chat completion.

The total length of input tokens and generated tokens is limited by the model’s context length.
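To make the quoted rule concrete, here is a minimal sketch of the arithmetic. The function name completion_budget and the numbers are made up for illustration; real limits depend on the model you deploy. It only computes the upper bound on generated tokens and does not model how the API reacts when a request does not fit.

```python
from typing import Optional


def completion_budget(context_length: int, prompt_tokens: int,
                      max_tokens: Optional[int] = None) -> int:
    """Upper bound on generated tokens for one request.

    context_length: total tokens the model can handle (prompt + output).
    prompt_tokens:  tokens already used by the prompt.
    max_tokens:     explicit cap, or None for "no explicit cap".
    """
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("the prompt alone fills the context window")
    if max_tokens is None:
        # No explicit cap: only the context length limits the output.
        return remaining
    # With a cap, the smaller of the two limits applies.
    return min(max_tokens, remaining)


# Hypothetical numbers, not taken from any specific model.
print(completion_budget(4000, 1000))        # 3000
print(completion_budget(4000, 1000, 256))   # 256
print(completion_budget(4000, 3900, 256))   # 100
```

So "defaults to inf" in practice still means "whatever room is left after the prompt".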

This is very confusing. The link below says it defaults to 16.

That is for Create Completion, not Chat. Scroll down further to Chat.

I am using completion (text-davinci) only, not chat. Davinci is completion, right?

So there are these two APIs: → this one is for Davinci Completions. → this one is for when you want to simulate the experience you get from the ChatGPT bot. Remember, it's a full-fledged bot, so it will have interactions. Completions might be more straightforward.

Either way, both need excellent prompts.
I learned this difference the hard way.

Maybe the number of tokens in your prompt payload exceeds 1000.
I ran into this once when I set the max tokens to 2048. In practice the constraint is prompt tokens + x <= 2048, where x is the tokens left for the task, i.e. the generated output.
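One way to check this is to look at the usage section of the response, which reports prompt_tokens, completion_tokens and total_tokens separately. The sketch below works on a plain dict with those three fields; the helper name describe_usage and the numbers are hypothetical, and how you get the usage dict out of your client or wrapper is not shown here.

```python
def describe_usage(usage: dict, max_tokens: int) -> dict:
    """Split a token count into prompt and completion parts.

    usage is expected to carry the three fields reported with a response:
    prompt_tokens, completion_tokens and total_tokens.
    """
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    total = usage["total_tokens"]
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
        # max_tokens only limits the generated part, not the prompt.
        "within_limit": completion <= max_tokens,
        # Sanity check: the total should be the sum of both parts.
        "consistent": prompt + completion == total,
    }


# Made-up numbers: a long prompt plus a completion capped at 256.
usage = {"prompt_tokens": 850, "completion_tokens": 256, "total_tokens": 1106}
report = describe_usage(usage, max_tokens=256)
print(report["total_tokens"], report["within_limit"])  # 1106 True
```

If the numbers look like this, max_tokens was respected: the total is above 1000 only because the prompt itself is long.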