Hello, I have a question about the token limit. Since more and more models can now fit very large contexts, I wonder why the number of output tokens remains limited.
Where does the output token limit come from? I always thought the tokens were simply counted together, input + output.
But that apparently isn't the case. Why not, and what is the limiting factor?
If it were just computing power, you could always generate 8k tokens and then carry on with a "Continue" button. But that button doesn't seem to exist anymore?
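To make my mental model concrete: as I understand it, input and output do share one context window, but on top of that there seems to be a separate per-response output cap. A rough sketch with made-up numbers (not tied to any specific model):

```python
# Illustrative sketch of a context window vs. a separate output cap.
# All numbers are invented for the example, not from any real model.

CONTEXT_WINDOW = 128_000   # total tokens the model attends to (input + output)
MAX_OUTPUT = 8_192         # separate hard cap on tokens generated per response

def remaining_output_budget(input_tokens: int) -> int:
    """Output is limited by whichever is smaller: the per-response cap,
    or the room left in the shared context window."""
    room_in_context = CONTEXT_WINDOW - input_tokens
    return min(MAX_OUTPUT, room_in_context)

# Even a short prompt cannot get more than MAX_OUTPUT tokens back:
print(remaining_output_budget(1_000))    # 8192

# Only when the input nearly fills the window does the shared
# "input + output" accounting become the binding limit:
print(remaining_output_budget(125_000))  # 3000
```

So my question is essentially: why does `MAX_OUTPUT` exist as its own limit at all, instead of the model just being allowed to generate until the context window is full?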