Token limits gpt-4o-mini-2024-07-18

I work with the gpt-4o-mini-2024-07-18 model. I want to send a large amount of text for analysis and unification, but I can't figure out what the token limits are when sending it via the API. For example, can I:

  • send 100,000 tokens as input?
  • receive 20,000 tokens as output?

Welcome to the community, @olgak007!

It depends on your usage tier. You can find out more here…

https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-two
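If you want to check your own tier's limits programmatically, every API response carries rate-limit headers. A minimal sketch with the openai Python SDK (header names per the rate-limits guide; the model string is just an example):

```python
# Sketch: reading the per-tier rate limits (RPM/TPM) from response headers.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini-2024-07-18",
    messages=[{"role": "user", "content": "ping"}],
)
print(raw.headers.get("x-ratelimit-limit-tokens"))      # your tier's tokens/min
print(raw.headers.get("x-ratelimit-remaining-tokens"))  # left in this window
completion = raw.parse()  # the normal parsed ChatCompletion object
```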

I've seen these limits. Thanks!
What I don't understand is the maximum size of a single request for input and output. Do I understand correctly:
Maximum input: 128,000 tokens
Maximum output: 16,384 tokens
Total for input and output: 128,000 + 16,384 = 144,384
https://platform.openai.com/docs/models#gpt-4o-mini
Or does using the API impose additional restrictions? For example, no more than 4,000 tokens in total for input and output?

The 128,000 count (125k, in fact) is the total combined input and output. It is the model's context window length: the inference memory that holds both the placed input and the continuation output the model forms. So if you were to allow 3k tokens for a response (a length typical of what the model will produce before the output becomes questionable or drifts against its training), you'd have about 122k of input space left.
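As a rough sketch of that budgeting arithmetic (using the ~125k usable figure above, and tiktoken's o200k_base encoding, which the gpt-4o family uses):

```python
# Sketch: input budget = context window minus tokens reserved for the output.
import tiktoken

CONTEXT_WINDOW = 125_000   # the ~125k usable figure; input AND output share it
RESPONSE_BUDGET = 3_000    # tokens reserved for the reply

enc = tiktoken.get_encoding("o200k_base")  # encoding used by the gpt-4o family

def fits(text: str) -> bool:
    """Rough check; real requests also spend a few tokens on message framing."""
    return len(enc.encode(text)) <= CONTEXT_WINDOW - RESPONSE_BUDGET

print(CONTEXT_WINDOW - RESPONSE_BUDGET)  # 122000 tokens left for input
```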

In practice, such a large input is not followed well. The attention mechanism works more like retrieval, extracting only certain facts at a time, and reinforcement learning on chat styles means the model focuses on the initial and final messages, since those are what delivered reward during training.
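That's why, for a task like yours, it usually works better to split the document and process it in pieces. A sketch (the chunk size here is an arbitrary assumption, not a documented limit):

```python
# Sketch: split a large document into token-bounded chunks instead of
# one giant prompt, since very long inputs tend to be followed poorly.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

def chunk_by_tokens(text: str, max_tokens: int = 8_000) -> list[str]:
    """Split text into consecutive pieces of at most max_tokens tokens each."""
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]
```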

The output limit is a hard model cap where generation is absolutely cut off, and you can't specify a larger cutoff. I've never approached that value without the output being terminated; the AI is not trained for writing or rewriting book chapters. The output being formed also acts as more input for the next token, generated recursively one at a time.
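You can detect that hard cutoff in the response and ask the model to pick up where it stopped. A hedged sketch with the Python SDK (the prompt and cap values are placeholders):

```python
# Sketch: finish_reason == "length" means the output hit the cap mid-thought.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "Rewrite this long text: ..."}]

resp = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    messages=messages,
    max_completion_tokens=4_096,
)
choice = resp.choices[0]

if choice.finish_reason == "length":
    # Cut off at the cap: resend everything plus the partial answer and
    # ask for the rest (the model itself keeps no state between calls).
    messages += [
        {"role": "assistant", "content": choice.message.content},
        {"role": "user", "content": "Continue exactly where you stopped."},
    ]
    resp = client.chat.completions.create(
        model="gpt-4o-mini-2024-07-18",
        messages=messages,
        max_completion_tokens=4_096,
    )
```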

Don't think of an API request as a "packet". Think of it as loading that input context into the otherwise stateless model, then running the generation of the language that appears after it (the assistant response) until the AI has finished its thought or followed the instruction and decides to stop, or until your max_completion_tokens value is reached.
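Put concretely, a request is just one stateless pass. A minimal sketch (the system prompt and budget are illustrative):

```python
# Sketch: one stateless pass: load the input context, then generate until
# the model decides to stop or max_completion_tokens is reached.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    messages=[
        {"role": "system", "content": "You unify terminology across texts."},
        {"role": "user", "content": "<your large text here>"},
    ],
    max_completion_tokens=3_000,  # the response budget reserved earlier
)

# usage shows how the shared context window was actually spent
print(resp.usage.prompt_tokens, resp.usage.completion_tokens)
```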