Confused about max_tokens parameter with GPT-4-turbo (128k - tokens used for prompt, or 4k?)

Hi there,

the documentation says:

max_tokens - integer or null
Optional
Defaults to inf

The maximum number of tokens to generate in the chat completion.

The total length of input tokens and generated tokens is limited by the model’s context length. See How to count tokens with tiktoken | OpenAI Cookbook for counting tokens.

When using GPT-4-turbo, with a maximum context of 128k tokens and a maximum response of 4k tokens, what value should I use in the REST call?

Should max_tokens be
128k - {tokens used for prompt},
or 4k (just the max_tokens for the response)?

There is usually no need to set max_tokens. Unless you are doing something very specific, evaluating some aspect of the model, or using the instruct model (which has a legacy 200-token default), omitting the max_tokens parameter is the way to go.

In our application, max_tokens is to be used as a “security mechanism” so that we can set a clear cost limit. We would therefore like to give users the option of setting the value.

1 Like

Then you will need to perform some dynamic token counting with tiktoken to calculate an appropriate value.

The value should be:

model’s max token context - tiktoken prompt token count - a small margin for system tokens (say 15), capped at your own limiting amount.
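
For illustration, a rough sketch of that calculation, assuming the cl100k_base encoding used by the GPT-4 family; the 15-token margin and the cost ceiling are values you would pick yourself:

```python
import tiktoken

MODEL_CONTEXT = 128_000     # gpt-4-turbo context window
COST_LIMIT_TOKENS = 2_000   # your own cost ceiling for the response (assumption)
SYSTEM_MARGIN = 15          # small allowance for chat-format overhead tokens

def compute_max_tokens(prompt: str) -> int:
    """Return a max_tokens value that fits the remaining context,
    capped at your own cost limit."""
    enc = tiktoken.get_encoding("cl100k_base")
    prompt_tokens = len(enc.encode(prompt))
    remaining = MODEL_CONTEXT - prompt_tokens - SYSTEM_MARGIN
    return max(0, min(remaining, COST_LIMIT_TOKENS))

print(compute_max_tokens("Summarise the attached report in three bullet points."))
```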

Okay… got it:

“message” → “max_tokens is too large: 126099. This model supports at most 4096 completion tokens, whereas you provided 126099.”

So in the new models max_tokens seems to apply to the response only, while in the old ones it covered prompt and response…

It’s been so long since I used it. These days I stream everything, and if I need to implement some kind of limit I can just close the connection when I reach my token count. The model will rattle off a few more tokens… usually 7-15 while it detects the connection is closed, and that's it.
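
As a rough sketch of that pattern, assuming the v1 Python SDK; the model name, the one-token-per-chunk approximation, and the 500-token limit are all placeholders:

```python
from openai import OpenAI

client = OpenAI()            # assumes OPENAI_API_KEY is set in the environment
TOKEN_LIMIT = 500            # hypothetical per-request cost ceiling

stream = client.chat.completions.create(
    model="gpt-4-turbo-preview",   # placeholder model name
    messages=[{"role": "user", "content": "Write a long essay on tokenizers."}],
    stream=True,
)

chunks, count = [], 0
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        chunks.append(delta)
        count += 1               # each streamed chunk is roughly one token
    if count >= TOKEN_LIMIT:
        stream.close()           # drop the connection; generation stops shortly after
        break

print("".join(chunks))
```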

3 Likes

@Foxalabs - A question on another topic.

We generate output in a JSON format and have the problem that the output can be longer than the token limit allows.

In that JSON format, n elements are generated.

Is it possible to tell GPT that n elements should be generated, but only as many as the token limit allows, so that a valid JSON document can still be generated and sent as a response?

Not reliably. The GPT series of models use a feed-forward network; they are not aware of what they have generated until they have generated it.

When I’m faced with a requirement like this, I look for a way to split the request into sections, each one well within the model's input and output limits, and then use traditional code to concatenate the outputs or otherwise process the results into a larger whole once finished.
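
For example, a minimal sketch of the stitching step, with placeholder strings standing in for the model's partial responses:

```python
import json

# Each element stands in for the model's response to one small sub-request.
partial_responses = [
    '[{"name": "Entry A", "bio": "…"}, {"name": "Entry B", "bio": "…"}]',
    '[{"name": "Entry C", "bio": "…"}]',
]

# Traditional code stitches the partial JSON arrays into one larger whole.
merged = []
for raw in partial_responses:
    merged.extend(json.loads(raw))

print(json.dumps(merged, ensure_ascii=False, indent=2))
```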

Then I run into the following problem:
Example:
I want 100 short biographies of the most important musicians of the 90s.
If I split this up and ask for 10 at a time, I always get (partial) short biographies of the same musicians. How do you solve this problem?

Use the large 128k input context to show the model which entries have already been processed, and instruct the model to avoid the listed entries when generating new ones.
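
A sketch of that idea, assuming batches of 10, a hypothetical ask_model wrapper around the chat completions call, and a model that returns a bare JSON array (a real pipeline would need error handling for malformed output):

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask_model(prompt: str) -> str:
    # Hypothetical wrapper; model name and max_tokens are placeholders.
    response = client.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=2000,
    )
    return response.choices[0].message.content

biographies: list[dict] = []
already_done: list[str] = []   # names covered in earlier batches

for _ in range(10):            # 10 batches of 10 = 100 biographies
    prompt = (
        "Return a JSON array of 10 objects with 'name' and 'bio' fields, "
        "covering important musicians of the 90s.\n"
        "Do NOT include any of these already-covered musicians:\n"
        + "\n".join(f"- {name}" for name in already_done)
    )
    batch = json.loads(ask_model(prompt))   # assumes a clean JSON array comes back
    biographies.extend(batch)
    already_done.extend(item["name"] for item in batch)

print(f"Collected {len(biographies)} biographies")
```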

1 Like

I had access to GPT-4 and turbo but now I don’t. Can I have it back?

Is this for the API or for ChatGPT? If it’s for the API, your account will need credit applied to it. If it’s ChatGPT, you will have to check whether you have a Plus membership; if not, you will have to wait for Plus memberships to start accepting new users again.

1 Like

Ta - I have a ChatGPT Plus membership under the only subscription option offered to me so far. I did have both GPT-4 and turbo again this morning, which was great, but it only lasted an hour or so - so, like everyone who gets a privilege, it hurts all the more when it’s taken away :see_no_evil:. What would this user have to do to obtain it permanently, or to get a response from Enterprise, as I want to commission an AI as a developer?

GPT-4 API access is not taken away once you have made a $5 API credit payment; it will still be there if you look on the Playground under chat mode and show all models: https://platform.openai.com/playground?mode=chat

Thank you Spencer. Appreciated your advice - super useful, Ta!

These days I stream everything, and if I need to implement some kind of limit I can just close the connection when I reach my token count. The model will rattle off a few more tokens… usually 7-15 while it detects the connection is closed, and that's it.

This is very impolite; they could implement an auto-ban for unnecessary server resource use. It is called “resource leakage”, and if I were the AI I’d ban you for a few minutes after finding out it was not accidental :wink:

AI doesn’t care if you are rude to it and close the connection while it is responding.

Twice in the last day I’ve been more rude back: “I stopped your response because you were being a dummy.” The matrix math forgets you the second a token dictionary is generated.

OpenAI made the decision, and the implementation, to also stop the AI model’s generation instead of letting it complete and giving you the full bill (as would happen if you closed the connection on a non-streaming API call).

1 Like