Hi everyone,
I have a feature request for OpenAI: please consider adding a flag to the create endpoints that, when enabled, automatically truncates the input tokens if they exceed the model’s maximum context length. This would prevent the InvalidRequestError that occurs in that case and make integration smoother for developers.
With this feature, users would no longer need to manually truncate their input prompts to fit within the context length limit. The flag would handle truncation automatically, keeping the conversation within the allowed token range. That would let us avoid hacky workarounds like the following:
I would say do what the error message says. Reduce your input prompt by truncating the older messages.
Be proactive about estimating the size of your input prompt and keep it under a certain level. You can do this with the estimate W = T/sqrt(2), where W is the number of English words and T is the number of tokens.
In your case, T = 1800 (or 1700 for more margin). If you use 1700 tokens, that is 1700/sqrt(2) ≈ 1200 words. If you count more than 1200 words, drop the older history until it fits in fewer than 1200 words.
Source: https://community.openai.com/t/gpt-3-how-to-reset-context-length-after-error/
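To make it concrete, here is roughly what that workaround looks like in code today. This is only a minimal sketch of the heuristic quoted above (W = T/sqrt(2), drop the oldest messages first). The function names, the plain list-of-strings history, and the whitespace-based word count are my own assumptions for illustration and are not part of any OpenAI library.

```python
import math
from typing import List


def words_budget(token_budget: int) -> int:
    """Convert a token budget to a word budget using the quoted heuristic W = T / sqrt(2)."""
    return int(token_budget / math.sqrt(2))


def count_words(text: str) -> int:
    """Rough word count: split on whitespace (an approximation, not a tokenizer)."""
    return len(text.split())


def drop_oldest(messages: List[str], token_budget: int = 1700) -> List[str]:
    """Drop messages from the front (oldest first) until the history fits the word budget.

    Hypothetical helper: `messages` is assumed to be ordered oldest to newest.
    The newest message is always kept, even if it alone exceeds the budget.
    """
    max_words = words_budget(token_budget)
    kept = list(messages)  # copy so the caller's list is not modified
    while len(kept) > 1 and sum(count_words(m) for m in kept) > max_words:
        kept.pop(0)  # discard the oldest message
    return kept


if __name__ == "__main__":
    history = ["a " * 1000, "b " * 800, "c " * 300]  # 1000, 800 and 300 words
    trimmed = drop_oldest(history, token_budget=1700)
    print(len(trimmed), "messages kept")
```

Every integration has to carry something like this, and because the word count is only an estimate, it can still overshoot the real limit. A single oversized message is also not handled at all. That is why I think a server-side truncation flag would be the cleaner solution.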