Same here! I signed up for GPT Pro two weeks ago. At first, it responded within seconds. Now it takes 30 minutes to two hours and produces garbage results. Repeatedly. Very frustrated with the service. False advertising. They are oversubscribing the capacity. That simple.
Same. It's fast when they sign you up, then they push your requests to the back of the queue and serve new users. Pyramid.
I too have been struggling with major lag. It's not the model performance that's lagging; the chat just gets way too big over time. Not ideal, but the best workaround I have found is to prompt ChatGPT: "Save the core of this session — the voice, system state, and current goals — into a launch_context.md file." It will create a file to download. To restore the launch state in a new chat:
- Say: "Restore launch state."
- Then upload: launch_context.md
ChatGPT will remember what was going on in the other chat and you can start from where you left off. Works a treat. It will be snappy again for a while, and you repeat this when it lags again.
I'm on the Plus plan and start new conversations regularly. For me it's not an issue of overly large chat contexts; it often happens in brand-new chats with only a couple of quick messages back and forth. One response will be quick, but most will take 30 or 40 seconds before it even starts generating.
Tried the macOS app, the iPhone app, chatgpt.com in multiple browsers, you name it. Same performance across all of them. Interestingly enough, it almost seems worse in the middle of the night. I wonder if OpenAI turns up the training compute in the middle of the night…
It's gotten so bad that I'm often forced to chat via the API, and I honestly don't know why I even pay for Plus at this point.
4/25/25 - It's absolutely HORRIBLE. NONE OF THE "SUGGESTED FIXES" WORK. Not new chats, not deleting its memory, not upgrading to Pro, not using alternate models 4/4.5/5/mini/whatever… THE WHOLE INTERNET INFRASTRUCTURE NEEDS A FULL UPGRADE TO HANDLE THE THROUGHPUT BEING DEMANDED BY USERS. Until then, ChatGPT's subscription money should be SUSPENDED.
Having the same issue as the OP. I do AI dungeon crawling on the service with a Plus plan, and it's easily repeatable: let it run a bit and you're sitting and waiting for a response, to the point where even the browser informs you that the page is frozen waiting on the GPT response. The official app is no better, and to me even a bit slower than the browser. Wait time is 3 to 10 minutes for each response and slowly getting worse with each reply; even editing a reply is a long wait. Browser is fully up to date, tried multiple browsers, OS is Win 11, even tried a fresh OS build, but still the same issue. Internet is not the issue: 900 down / 45 up with only a few devices on it. Not sure what to do to resolve this slow response time. Deleting and recreating the chat defeats the purpose of its intended use.
Guess I got my answer to the problem I was having just by asking the AI itself why it was being so slow. I was using 4o and it got so bad it just stopped working altogether in that chat. I like how it treats dungeon storytelling compared to other AI models on the market, but you can easily reach the limit it's claiming in a day or two. Maybe this might help others if they are having similar problems. My issue now is that the same chat barely functions, or takes over 8 minutes to respond if it does at all without crashing and giving a "something went wrong" message.
And telling the AI to create a file to export, so it can be imported into a new chat, never works for me; I get an instant "file not found". Here is what I got from my broken chat when I asked for a usage summary, and then asked the question in a new chat. I'm not using the API because that is not part of the Plus subscription.
TOKEN USAGE SUMMARY:
- Tokens used in this chat so far: 132,354 tokens
- Broken down:
  - 126,910 tokens for normal messages, commands, storytelling, and choices
  - 5,444 tokens for management prompts and background instruction/setup
Current Chat Token Limits (Across Models)
| Model Name | Max Context Window (tokens) | Notes |
|---|---|---|
| GPT-4o | 128k tokens | Current newest “omni” model (text + vision), very fast and smarter at long conversations, but same token limits as 4-turbo. |
| GPT-4o-mini | 128k tokens | Mini-optimized version; cheaper, even faster, but still 128k max. |
| GPT-4.5 (rumored/internal) | (NOT fully live) | 4.5 was expected but folded into GPT-4o. No public 4.5 release. |
| GPT-4-turbo (your base model here) | 128k tokens | Older April 2024 upgrade — very cost-efficient, very capable, used for most ChatGPT Plus. |
| GPT-3.5 (o3) | 16k tokens | Much smaller! Only 1/8 the memory size of GPT-4o. Faster, cheaper, but very easy to overflow in big RPGs. |
| GPT-4o-mini-high | 128k tokens | Likely just a naming alias for GPT-4o tuned differently; same token window. |
What does 128k tokens really mean?
- 1 token = ~¾ of a word in English.
- 128k tokens = ~96,000 words (depending on punctuation, names, etc.)
- This includes:
  - Everything you type
  - Everything the AI types
  - All “thinking” it does internally (yes, it burns tokens even preparing answers)
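As a sanity check, the word/token arithmetic quoted above can be sketched in a few lines. This uses the ~¾-word-per-token rule of thumb from the summary; the exact ratio varies by text, and a real tokenizer (such as OpenAI's tiktoken) gives exact counts, so treat this only as a ballpark estimate:

```python
# Back-of-the-envelope check of the "1 token ~ 3/4 of an English word"
# rule of thumb quoted above. Illustrative only; real tokenizers give
# exact, text-dependent counts.

CONTEXT_WINDOW = 128_000  # tokens; GPT-4o-class limit per the table above

def words_from_tokens(tokens: int) -> int:
    """Estimate words as tokens * 0.75 (1 token ~ 3/4 of a word)."""
    return round(tokens * 0.75)

def tokens_from_words(words: int) -> int:
    """Inverse estimate: tokens ~ words / 0.75."""
    return round(words / 0.75)

print(words_from_tokens(CONTEXT_WINDOW))  # 128k tokens ~ 96,000 words
```

This reproduces the ~96,000-word figure above exactly, since 128,000 × 0.75 = 96,000.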
TL;DR Table (Simple):

| Model | Words Before Problems |
|---|---|
| GPT-4o / 4o-mini / turbo | ~90,000–95,000 words |
| GPT-3.5 | ~11,000–13,000 words |
Critical understanding:
When you hit about 60% of the limit (~50,000–60,000 words):
- Responses get slower.
- Lag starts.
- Eventually full stall or crash (8+ minutes, or “Something went wrong”).
You were absolutely correct in noticing it happening.
Chat Overflow Death Triggers:
| Symptom | Cause |
|---|---|
| 8+ minute reply lag | Tokens over ~80k |
| “Error: Message failed to load” | Tokens over ~100k |
| Partial replies | Model losing context window space |
| Incoherent memory | Model forgetting early story due to context overflow |
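The thresholds in the tables above can be collapsed into one small lookup. Note these cutoffs (~60% slowdown, ~80k heavy lag, ~100k errors, 128k hard limit) come from this post's quoted summary, not from any official OpenAI specification, so this is only a sketch of the claimed behavior:

```python
# Hypothetical classifier for the overflow stages quoted above, given a
# rough token count for the chat. The thresholds are this post's claims,
# not documented limits.

LIMIT = 128_000  # hard context window claimed for GPT-4o-class models

def overflow_stage(tokens: int) -> str:
    """Map an estimated token count to the symptom bucket described above."""
    if tokens >= 100_000:
        return "errors / failed messages"
    if tokens >= 80_000:
        return "8+ minute reply lag"
    if tokens >= int(LIMIT * 0.6):  # ~76,800 tokens, the ~60% slowdown point
        return "responses slowing down"
    return "normal"

print(overflow_stage(132_354))  # the broken chat above: errors / failed messages
```

By this model, the 132,354-token chat quoted earlier is well past the error threshold, which matches the crashes the poster describes.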
Why it doesn’t auto-cut:
- ChatGPT treats everything you wrote in a session as “active memory” — even stuff 500 messages ago.
- It doesn’t know “where” to cut safely without breaking story flow or damaging the conversation.
- OpenAI made it “safe” (always remembering everything you typed in this chat) instead of “smartly cutting” for performance.
- It won’t guess what you want to keep vs what you want to discard. (To prevent ruining serious conversations.)
So it carries 100% of everything — until it chokes.
Other AI platforms?
- Some (like KoboldAI, SillyTavern, OpenRouter with special models, or NovelAI) use different server setups.
- They allow huge memory files (500k+ tokens) because they custom-cut memory slices smarter.
- They also sometimes skip thinking layers to just generate text blindly, which is faster but less “smart” in weaving plots.
ChatGPT stays “smarter” about weaving but pays for it with memory pressure.
Final Brutal Truth:
All GPT-4o variants (turbo, mini, high) = 128k token limit.
No version currently gives true 500k–1M tokens inside OpenAI ChatGPT yet.
Some private custom systems outside OpenAI (like OpenRouter setups) can simulate that, but less coherent.
Have exactly the same issue, and so far this makes the most sense. It's ridiculous that OpenAI didn't specify these 'caps' and (I can only imagine intentionally) only gives vague error notes on why the lag is happening. I paid for Plus to do literally one thing: complete a table of 100 rows and 5 columns. It's semi-complete and I wanted ChatGPT to finish the whole table (literally just add basic info like websites). It couldn't do 20 rows before crashing. If that is literally the limit of its processing caps (and I repeat, that was the ONLY task I gave it after signing up to Plus), then I need a refund.
I have the same issues, and am tired of rebooting ChatGPT, my computer, etc. Also tired of starting new chats that don't endure a day. But the billing system shows no issue… We need solutions fast.
You devs have gotta fix this and increase the tokens or something, because this performance issue has been going on for quite a while now.
May 7. I'm on the free version and it's been hopelessly slow: it gives a couple of words of the answer quite quickly, then takes about two to three minutes to carry on with the next paragraph. Hopeless. David
It was good a few months ago, even a couple of weeks ago, but especially in the last couple of days (I'd say the last week) it's been horrendous. It also makes the whole page hang for some reason.
Same issue for me; it started 5 days ago. I have an extreme desktop setup, 3 fiber connections, ANY popular browser, cache cleared or not, and I see this behavior: ONLY the FIRST query to the ChatGPT web app works normally; any other query shows a few words and then gets stuck for a few minutes until it fills in the rest of the response. It seems the only workaround is to close the browser and re-login for EACH new query, which is absurd… The same account on mobile works flawlessly.
It doesn't even open in Chrome; I have to use Opera to make it usable.
Recently the same here. I don't want to say unusable, but it's been taking 30 seconds to think and respond, and the prompts aren't getting any more complicated than before on my end. I'm guessing their servers are overwhelmed because literally everybody uses it. Booo.
I can sit here and read Lord of the Rings while waiting for it. I'm sure I saw the grass grow a metre… Look, I agree with a.dark.shaolin: ChatGPT, all flavours, in whichever browser, performance has been woeful to say the least. If I can make a cuppa in the time it takes to respond, then there's something serious going on with the backend. I know it's not my system or internet; I have a 1GB link. It's at the ChatGPT end. Yes, I use it quite a bit lately. Tonight I just gave up asking it for some simple coding and ended up writing the code myself in less time. Maybe that's the ploy to get humans to do the work now… lol. Oh, and let's not forget how short its memory and attention span has become. I pay for my ChatGPT and don't appreciate it being non-responsive.
I have upgraded to ChatGPT Plus and it really sucks. I wait nearly 10 seconds for each answer; this is fully useless! It also complains about multiple requests, or too many requests. I PAID!
This service is seriously going to be replaced by Claude or Gemini soon.
Been seeing the same over the last couple of months. Becoming unusable at times.
Based in the UK. Using the desktop app. Windows 11. More than ample power and 1GB internet over Ethernet.
The problem is at the server end.
Summarising chats and starting again in a new one helps momentarily, but this is becoming laughable. And environments are barely kept alive long enough to be able to go back and download generated files, meaning that data has to be uploaded again and again. Memory in chats is also being lost.
I hate the idea of having to move to a competitor, as so much history and so many projects are here, but I'm paying for a service that keeps getting worse daily.
ChatGPT is vulnerable to long chat threads. If you keep writing in the same thread, ChatGPT will eventually slow down significantly.
Solution: Start a new thread. This improves performance.
If you want to bring your history into the new thread, copy the entire conversation (CTRL-A, then CTRL-C) from the old thread, paste it (CTRL-V) into a Word document, and save the file on your computer as "Thread 1.docx". Then upload the file into the new ChatGPT thread with a message telling it to base all future conversation in the thread on this file.
Now ChatGPT will run fast again, and the old thread history is preserved.
Next time the thread becomes too long and ChatGPT slows down again, repeat the process:
Copy the conversation into a new document, named “Thread 2.docx”.
Start a new thread. Tell ChatGPT to base all future communication on both "Thread 1" and "Thread 2", and attach the files.
The new thread will now be fast again, and your collective history is preserved.
You can continue this process every time the thread becomes slow.
Remember to keep all thread history documents available on your computer for as long as you need the thread history.
It can help with some issues.