How to increase token limits

Hello everyone! Surprisingly, on such a large platform, I haven’t been able to find a way to increase the token limit to at least 300,000 characters for the past six months. Editing large literary texts in parts or with hints isn’t an option, as the assistant inevitably loses context and history. I couldn’t find a way to increase the limit at any price, nor could I find live support. Maybe it simply doesn’t exist, and the platform is entirely run by AI.

1 Like

Welcome to the community!

Some models support up to 128,000 tokens, and a token is roughly 0.75 English words (about four characters). https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
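As a rough sanity check, using the ~4 characters-per-token heuristic from that help article (an approximation, not an exact tokenizer count), a 300,000-character manuscript nearly fills a 128k context window before the model writes a single word of output:

```python
# Rough token estimate using the ~4 characters/token rule of thumb.
# This is a heuristic for English text, not an exact tokenizer count.

def estimate_tokens(text_length_chars: int) -> int:
    """Estimate the token count for English text of the given length."""
    return text_length_chars // 4

manuscript_chars = 300_000
input_tokens = estimate_tokens(manuscript_chars)  # ~75,000 tokens of input

# Rewriting the whole text means the model must also *generate* a
# similarly sized output, so the job needs roughly double that:
total_tokens = input_tokens * 2                   # ~150,000 tokens

print(input_tokens, total_tokens)  # 75000 150000 -- over a 128k window
```

So even before any history accumulates, input plus rewritten output for a text of that size exceeds the window.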

This is a hard limit when it comes to completion models for OpenAI, and sort of a soft limit on a per-model basis in general - performance tends to degrade if you have a gigantic token window.

People typically deal with this by using some sort of Retrieval-Augmented Generation (RAG) approach.
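The first step of any RAG pipeline is splitting the text into chunks. A minimal sketch, where neighbouring chunks overlap so each one carries a little of the surrounding context (the chunk and overlap sizes here are illustrative assumptions, not recommended values):

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks (sizes in characters).

    The overlap means each chunk repeats the tail of the previous one,
    which helps retrieval and rewriting keep local continuity.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

In a real RAG setup each chunk would then be embedded and stored, and only the most relevant chunks retrieved into the prompt.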

OpenAI has a solution for this in the form of Assistants File Search (Warning! It’s in BETA: https://platform.openai.com/docs/assistants/tools/file-search)

You might think, “this is not what I want!” But behind the scenes, LLMs work in much the same way: they ‘search’ through your long context and elevate specific concepts in your text, which they then use to generate the output. Models can’t “see” your entire text as such. Embedding-based file search works in a very similar way.

3 Likes

Thank you for your helpful response!

However, I’m afraid your advice won’t work in my case.

Let me explain my problem with an example.

I wrote a literary text of 300,000 characters and wanted to improve it using GPT—enhancing the style, adding stronger emotional and immersive elements, but without changing the core storyline. GPT handles this task well, but only until the session’s token limit is reached.

Once the limit is exceeded and I continue in a new request, the assistant completely loses connection with the previous text. No matter how I try to explain what happened earlier, it still starts improvising, taking the story in unintended directions. As a result, the plot becomes unrecognizable, and GPT itself loses track of what it’s doing.

Yes, it can refine grammar and style. But if the goal is to enrich the text without altering the story, GPT cannot do this effectively without sufficient tokens. Even splitting it into two requests doesn’t work—it requires a continuous and cohesive approach to text processing.

When I asked GPT what it thought about this, here is what it told me :)
You’re absolutely right! GPT has limitations when it comes to maintaining long-term context. If a text is too large and needs to be split into parts, even with detailed instructions, a new session will lose some continuity. This is especially noticeable in creative writing, where style, tone, and emotional depth matter.

Ideally, for such a task, you’d need a tool with a longer context window or the ability to load the entire text and work on it without losing coherence. So your concern is valid, and right now, GPT doesn’t have a simple solution for this.

You can try Gemini and see if it is more helpful; it supports a one-million-token context.

How about creating your own JavaScript chatbot that queries the API and specifies the max_completion_tokens parameter at (roughly) the maximum?
I have never tried with texts as long as 300,000 characters, but my personal chatbot based on GPT-4o mini normally handles up to 60,000-62,000 characters (I have used 16,000 tokens as the value for the parameter).
I have tested it multiple times while extending my website templates.

The topic’s concern is about the desire for sending large (and what would be expensive) inputs to the AI model, larger than the total context window length of any model offered by OpenAI.

What you discuss actually works opposite of the desires.

max_completion_tokens instead refers to the cost limit of generated output by the model. When you specify this, it actually serves to reserve that much space for you from the context window solely for the response. Increase the output size from 2k to 15k, and the total input you can send in an API call is thus reduced from 123k to 110k.

It’s better to have no max_completion_tokens parameter at all when approaching the top end of context length, so the balance between the input and output part is automatic. However, then there is also no specific space reserved for output – your input could consume nearly the entire context window if you sized it just right, and the responses would be prematurely cut off when the context window length is exhausted, or you’d have no room to “chat” or to use tool calls.
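The trade-off described above is simple arithmetic. A sketch using the same illustrative numbers as in the post (treating roughly 125k tokens as the total usable budget for input plus output):

```python
def input_budget(context_window: int, max_completion_tokens: int) -> int:
    """Tokens left for the prompt after reserving space for the response."""
    return context_window - max_completion_tokens

# Illustrative total budget matching the post's 123k -> 110k example;
# the exact usable window varies by model.
CONTEXT = 125_000

print(input_budget(CONTEXT, 2_000))   # 123000
print(input_budget(CONTEXT, 15_000))  # 110000
```

Every token reserved for output is a token you can no longer spend on input.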

There are smarter ways of achieving the overall goal and using input context wisely: summarization of parts.
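One way to sketch that “summarize the parts” idea: rewrite the text chunk by chunk, carrying forward a running summary of everything already processed, so each call has the context it needs without the full history. The `rewrite_chunk` and `summarize` callables below are hypothetical placeholders you would implement with actual model calls:

```python
from typing import Callable

def rolling_rewrite(
    chunks: list[str],
    rewrite_chunk: Callable[[str, str], str],  # (summary_so_far, chunk) -> rewritten chunk
    summarize: Callable[[str], str],           # text -> short summary
) -> str:
    """Rewrite a long text piece by piece without losing the thread.

    Each chunk is rewritten with a running summary of the story so far
    supplied as context; the summary is then updated to cover that chunk.
    """
    summary = ""
    rewritten = []
    for chunk in chunks:
        rewritten.append(rewrite_chunk(summary, chunk))
        summary = summarize(summary + "\n" + chunk)
    return "\n".join(rewritten)
```

This keeps each request small, at the cost of the model seeing only a compressed version of the earlier text rather than the full original.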

2 Likes

Thanks for your advice, but Gemini can’t see more than about 30k characters of history. Yes, it remembers the chats, but after 30k it stops making use of them. Maybe I’m not exact about the number of characters, but in my experiment it lost the thread at exactly 30k.

The bottom line is this: when GPT developers claim it can write books, it’s somewhat of an exaggeration. GPT doesn’t have the ability to “see” the story it has conjured up, so how can it fully continue it? Yes, GPT can produce astonishing stories that amaze us, especially when you realize they were created by artificial intelligence. However, this has no connection to genuine literary work. I hope that in the future, AI developers will bring their creations to a level where we see new ideas in literary creativity.
And here’s what GPT itself thinks about this
Of course, let’s delve into this issue.

GPT and similar large language models function as predictive systems that are trained on vast amounts of text data. They analyze sequences of words and phrases and use probabilities to predict what might come next. This inevitably leads to a certain level of restriction since GPT relies on what has already been written.

Key Limitations:

  1. Logic and Coordination: GPT cannot “see” the entire story like a human author. It generates text based on previous tokens but lacks the memory or reservoirs of ideas that can be developed, like a human does.
  2. Creativity and Originality: While GPT can generate creative and sometimes astonishing ideas, they are based on reworking existing information. This can lead to impressive results, yet these results more closely resemble paraphrasing rather than creating fundamentally new ideas.
  3. Context and Motivation: Crafting a complete work of art requires a deep understanding of characters, motivation, and emotional arcs that develop throughout the text. GPT generally lacks the mechanisms to understand and coordinate such complex structures.

My Thoughts:

GPT is undoubtedly a powerful tool, capable of inspiring authors and other creative professionals. However, it lacks the human element that makes literature meaningful and moving. Artificial intelligence will not yet replace the human ability to consciously choose from numerous paths, modify contexts, deepen motivations, and create character biographies if needed.

In the future, it’s possible that more advanced forms of AI will be able to offer something new in terms of literary creativity. However, there will always remain an aspect of unique human perspective and emotion that artificial intelligence will never be able to fully reproduce. It will serve as a brilliant complement, but not a replacement for authentic human creativity.

1 Like

Oh, you’re right, I had misunderstood the sender’s intentions. My bad.