According to the docs, gpt-4o has a context window of 128,000 tokens. I passed `max_tokens=64000` as a parameter. Why am I getting the error `openai.BadRequestError: Error code: 400 - {'error': {'message': 'max_tokens is too large: 64000. This model supports at most 4096 completion tokens, whereas you provided 64000.', 'type': None, 'param': 'max_tokens', 'code': None}}`? Where does the number 4096 come from? I didn't find other related posts to be very helpful.
I think you might be confused by what the max_tokens parameter actually does.
It isn't used to set the context window of the model; it sets a limit on how many tokens the model will output in a single response.
For gpt-4o that limit is 4096 completion tokens, which is where the number in your error comes from. I believe that's the case for just about every model out there right now, not just OpenAI's.
There is no parameter to adjust the context window of the model itself. That's fixed at 128k, and it's up to you to manage what goes into it on your own, unless you use Assistants.
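For reference, here's a minimal sketch of a call that stays within the completion-token limit, using the openai Python SDK (the prompt is just a placeholder):

```python
# Minimal sketch: max_tokens caps only the *output* of a single response.
# The 128k context window is the combined budget for input + output.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this report: ..."}],
    max_tokens=4096,  # at or below the model's completion-token limit
)
print(response.choices[0].message.content)
```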
To add some colour here: the idea is that a long-context model is good at processing ~100k tokens of input, but only generates up to about 4k output tokens at a time.
For example, a long-context model can read a short story (say, 50k tokens) and answer a question about it in under 4k tokens.
Although I'm sure this will change in the long run, right now a 100k-token output would probably devolve into chaos, and there are fewer use cases for it.
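If it helps, here's a rough sketch of how that budget works out in practice. It assumes the tiktoken package and the o200k_base encoding used by gpt-4o; the file name and numbers are just placeholders:

```python
# Rough sketch: check that a long prompt still leaves room for a
# full-length completion inside the 128k context window.
import tiktoken

CONTEXT_WINDOW = 128_000   # total tokens the model can see (input + output)
MAX_OUTPUT = 4_096         # completion-token cap per response

enc = tiktoken.get_encoding("o200k_base")

def fits_in_context(prompt: str) -> bool:
    input_tokens = len(enc.encode(prompt))
    return input_tokens + MAX_OUTPUT <= CONTEXT_WINDOW

story = open("short_story.txt").read()   # e.g. ~50k tokens
question = "Who is the narrator, and how do they change by the end?"
print(fits_in_context(story + "\n\n" + question))
```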
Do Assistants provide a longer context window than 128k tokens, or do they perform some sort of pre-processing of larger files in the backend and "reduce" uploaded files and instructions to <= 128k tokens? I'm confused because the official docs still say that gpt-4o has a 128k context window. Am I missing anything where they officially talk about longer context windows?
Hi, the context window of the underlying model is 128k. RAG pipelines and other methods, like Microsoft's AI search system, are used to choose which data is included in the prompt, up to that 128k maximum.
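To illustrate the idea (this is a toy sketch, not the actual Assistants or Azure AI Search implementation): the retrieval step ranks chunks of your files by relevance, then packs as many as fit into the prompt while leaving room for the output.

```python
# Toy illustration of prompt packing in a RAG pipeline. Chunk ranking
# happens elsewhere; here we only enforce the 128k token budget.
import tiktoken

CONTEXT_WINDOW = 128_000
RESERVED_FOR_OUTPUT = 4_096

enc = tiktoken.get_encoding("o200k_base")

def build_prompt(question: str, ranked_chunks: list[str]) -> str:
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT - len(enc.encode(question))
    selected = []
    for chunk in ranked_chunks:          # most relevant first
        cost = len(enc.encode(chunk))
        if cost > budget:
            break
        selected.append(chunk)
        budget -= cost
    return "\n\n".join(selected) + "\n\nQuestion: " + question
```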
I see, yes that makes sense. Thanks!