How to use the 128k context via the API

I want to use the 128k context. My code gets a text and transforms it; on each loop it appends the original text, the transformed text, and some additional text, then transforms that again to keep the context consistent. But I get this error: `This model's maximum context length is 8192 tokens. However, your messages resulted in 10366 tokens. Please reduce the length of the messages.` I checked the documentation but couldn't find anything about how to use the 128k context.
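A minimal sketch of the loop described above, with a hypothetical `transform` function standing in for the real model call. It shows why the prompt grows on every pass until it overflows the context window:

```python
# Sketch of the accumulating loop; `transform` is a placeholder for the
# actual API call, and the strings are illustrative only.
def transform(text: str) -> str:
    return text.upper()  # stand-in for the model's transformation

prompt = "original text"
for step in range(3):
    transformed = transform(prompt)
    # Each iteration feeds the whole history back in, so the prompt
    # roughly doubles in size every pass.
    prompt = prompt + "\n" + transformed + "\nadditional instructions"

print(len(prompt))
```

Because the token count compounds like this, even a modest starting text exceeds an 8,192-token window after a few iterations.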

Welcome to the community!

What model are you using?

## GPT-4 Turbo and GPT-4

GPT-4 is a large multimodal model (accepting text or image inputs and outputting text) that can solve difficult problems with greater accuracy than any of our previous models, thanks to its broader general knowledge and advanced reasoning capabilities. GPT-4 is available in the OpenAI API to paying customers. Like gpt-3.5-turbo, GPT-4 is optimized for chat but works well for traditional completions tasks using the Chat Completions API. Learn how to use GPT-4 in our text generation guide.

| Model | Description | Context window | Max output tokens | Training data |
|---|---|---|---|---|
| gpt-4-turbo | The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Currently points to gpt-4-turbo-2024-04-09. | 128,000 tokens | 4,096 tokens | Up to Dec 2023 |
| gpt-4-turbo-2024-04-09 | GPT-4 Turbo with Vision model. Vision requests can now use JSON mode and function calling. gpt-4-turbo currently points to this version. | 128,000 tokens | 4,096 tokens | Up to Dec 2023 |
| gpt-4-turbo-preview | GPT-4 Turbo preview model. Currently points to gpt-4-0125-preview. | 128,000 tokens | 4,096 tokens | Up to Dec 2023 |
| gpt-4-0125-preview | GPT-4 Turbo preview model intended to reduce cases of "laziness" where the model doesn't complete a task. Learn more. | 128,000 tokens | 4,096 tokens | Up to Dec 2023 |
| gpt-4-1106-preview | GPT-4 Turbo preview model featuring improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. This is a preview model. Learn more. | 128,000 tokens | 4,096 tokens | Up to Apr 2023 |
| gpt-4 | Currently points to gpt-4-0613. See continuous model upgrades. | 8,192 tokens | 8,192 tokens | Up to Sep 2021 |
| gpt-4-0613 | Snapshot of gpt-4 from June 13th 2023 with improved function calling support. | 8,192 tokens | 8,192 tokens | Up to Sep 2021 |
| gpt-4-0314 (Legacy) | Snapshot of gpt-4 from March 14th 2023. | 8,192 tokens | 8,192 tokens | Up to Sep 2021 |

For many basic tasks, the difference between GPT-4 and GPT-3.5 models is not significant. However, in more complex reasoning situations, GPT-4 is much more capable than any of our previous models.

https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
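As the table shows, the 8,192-token error means the request is hitting plain `gpt-4`; the 128k window requires selecting one of the Turbo models by name. A minimal sketch, assuming the official `openai` Python SDK; the `build_request` helper is hypothetical, but the payload fields match the Chat Completions API:

```python
# Context windows from the model table above (in tokens).
CONTEXT_WINDOWS = {
    "gpt-4": 8_192,        # the source of the "maximum context length is 8192" error
    "gpt-4-turbo": 128_000,
}

def build_request(user_text: str, model: str = "gpt-4-turbo") -> dict:
    """Hypothetical helper: build a Chat Completions payload for a 128k model."""
    if model not in CONTEXT_WINDOWS:
        raise ValueError(f"Unknown model: {model}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        # Note: output is still capped at 4,096 tokens even with a 128k window.
        "max_tokens": 1024,
    }

payload = build_request("Summarize this long document ...")
print(payload["model"])
```

With the SDK you would send this as `client.chat.completions.create(**payload)`; the key point is simply that `model` must name a 128k-context model such as `gpt-4-turbo`.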

I use gpt-4o. It has a context window of up to 128k.