Feature Request - Improved Handling of Maximum Context Length in Create Endpoints

Hi everyone,

I have a feature request for the OpenAI API: please consider adding a flag to the create endpoints that, when enabled, automatically truncates input tokens that exceed the model’s maximum context length.

This would prevent the InvalidRequestError that is raised today and make integration smoother, providing a more seamless developer experience (DX).
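
To make the idea concrete, here is roughly what I am imagining (a sketch only: the truncate parameter below is the proposal itself, not something the API accepts today):

```python
import openai

messages = [
    {"role": "user", "content": "..."},  # potentially a very long conversation history
]

# "truncate" is the proposed flag, NOT an existing parameter.
# When True, the API would drop the oldest messages server-side instead of
# raising InvalidRequestError when the prompt exceeds the context window.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    truncate=True,  # proposed; would default to False
)
```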

With this feature, users would not need to manually truncate their input prompts to fit within the context length limit; the flag would handle truncation automatically and keep the conversation within the allowed token range. It would also spare developers hacky workarounds like the following:

I would say do what the error message says. Reduce your input prompt by truncating the older messages.

Be proactive in estimating your input prompt and keep it under a certain level. You can do this with the estimate W = T / sqrt(2), where W is the number of English words and T is the number of tokens.

In your case, T = 1800 (or 1700 for more margin). A budget of 1700 tokens works out to about 1200 words, so if you count more than 1200 words, drop the older history until it fits under 1200 words.

Source: https://community.openai.com/t/gpt-3-how-to-reset-context-length-after-error/
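
In code, that heuristic amounts to something like the sketch below (the words-per-token ratio is only a rough estimate for English text):

```python
import math

def trim_by_word_count(messages, token_budget=1700):
    """Estimate W = T / sqrt(2) English words for a budget of T tokens,
    then drop the oldest messages until the word count fits."""
    word_budget = int(token_budget / math.sqrt(2))  # 1700 tokens -> ~1200 words
    trimmed = list(messages)
    while trimmed and sum(len(m["content"].split()) for m in trimmed) > word_budget:
        trimmed.pop(0)  # oldest message first
    return trimmed
```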

The counter to that is to use a token counter such as tiktoken prior to sending your message to the API.
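
For example (a minimal sketch; the model name is just an illustration, and a full chat payload also carries a few tokens of per-message overhead that this ignores):

```python
import tiktoken

def count_tokens(text, model="gpt-3.5-turbo"):
    """Count tokens exactly as the model's tokenizer would."""
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(text))

print(count_tokens("How many tokens is this sentence?"))
```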

The same problem remains: performing such calculations shouldn’t be required. You could still do it yourself when you want custom control over truncation, but for the standard case (dropping the oldest messages first) a flag would suffice.
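
And that standard truncation is exactly the boilerplate every caller has to write today, something like this sketch (it ignores per-message formatting overhead, so real code should keep some margin):

```python
import tiktoken

def truncate_oldest_first(messages, max_tokens=4000, model="gpt-3.5-turbo"):
    """Drop the oldest messages until the conversation fits the token budget.
    A real implementation would probably also pin the system message in place."""
    enc = tiktoken.encoding_for_model(model)

    def total_tokens(msgs):
        return sum(len(enc.encode(m["content"])) for m in msgs)

    trimmed = list(messages)
    while len(trimmed) > 1 and total_tokens(trimmed) > max_tokens:
        trimmed.pop(0)  # oldest message first
    return trimmed
```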

Ultimately, the goal is to give API users an experience similar to the ChatGPT interface (you don’t delete old messages manually until the chat history fits, right?).

Well, you run into other problems when doing it that way. For example, a novice API user sends text that is too large and relies on some data from the start of the conversation that has now been truncated away; unaware of the truncated flag, they spend a long time trying to debug why some of their messages work and some don’t. Leaving things as they are places the responsibility for token management firmly on the caller, which actually reduces potential confusion.

The truncated flag would be false by default, so callers who don’t opt in would see no change in behaviour.
