This weekend I have been experiencing some issues with ChatGPT which are likely only temporary. However, a somewhat random observation gave me some insight into why the model sometimes looks particularly broken, dumb or forgetful.
The main goal of this post is to highlight this specific failure mode / bug that makes ChatGPT’s responses look “dumb”, “forgetful” and “confused”, but which has a simple, understandable cause which you can go and check for yourself.
tl;dr: If ChatGPT is behaving particularly badly and inconsistently, reload the current conversation and check the transcript. You may realize that the transcript is missing or is truncating some messages. If so, the conversation is broken; start a new one. This does not necessarily fix the issue (the new conversation might also break), but at least you know what’s going on.
To be clear, this post doesn’t cover all failure modes, just this one in particular.
The problem pertains to the user interface and the handling of the context window. Specifically, the ChatGPT interface shows what looks like a full answer, but internally ChatGPT only receives/stores truncated answers, or even misses entire bits of the conversation.
From the user’s perspective, it looks like ChatGPT is behaving stupidly: ignoring the context or the user’s commands, repeating itself, forgetting previous interactions, etc. But ChatGPT’s behavior makes a lot of sense given the context it is actually fed.
Arguably, this issue has nothing to do with the internals of the LLM not working or being tampered with (e.g., attention, number of experts, etc.), or with OpenAI intentionally nerfing the model to balance the load. Instead, it seems to be some “simple” network or database error in saving ChatGPT’s responses, and evidently a bug rather than anything intentional.
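To make the failure mode concrete, here is a minimal sketch of how a chat pipeline might rebuild the model’s context from stored messages before each call. This is purely illustrative (the function and data layout are my own assumptions, not OpenAI’s actual pipeline), but it shows how a lost or truncated assistant turn produces exactly the symptoms described in this post: consecutive user replies get clumped together, and the model never “sees” the message the user was replying to.

```python
# Hypothetical illustration of context assembly from stored messages.
# Names and structure are assumptions for the sake of the example,
# not OpenAI's actual implementation.

def build_context(stored_messages):
    """Concatenate stored turns, merging consecutive same-role messages
    (as many chat pipelines do before calling the model)."""
    merged = []
    for role, text in stored_messages:
        if merged and merged[-1][0] == role:
            # Two turns by the same role in a row get clumped into one.
            merged[-1] = (role, merged[-1][1] + "\n" + text)
        else:
            merged.append((role, text))
    return merged

# What the user saw on screen: a complete exchange.
on_screen = [
    ("assistant", "Pick a scenario: 1) Mars colony ... 6) French resistance"),
    ("user", "French resistance"),
    ("assistant", "Choose a character: Lucien, Marie, ..."),
    ("user", "Lucien"),
]

# What actually got saved: one assistant answer truncated, one lost entirely.
stored = [
    ("assistant", "Pick a scenario: 1) Mars colony ..."),  # truncated list
    ("user", "French resistance"),
    # second assistant message was never saved
    ("user", "Lucien"),
]

context = build_context(stored)
# The two user replies collapse into a single message, and the model has
# no record of ever listing characters -- so "Lucien" means nothing to it.
```

From the model’s side, “Lucien” arrives with no preceding question, which is why the responses read as confused rather than obviously broken.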
How do I know what’s going on?
After some exasperating conversations using a Custom GPT whose behavior I know extremely well (I have playtested it extensively), I logged out, logged in again, and went back to check the transcripts.
The transcripts arguably report what ChatGPT actually “saw”, i.e. received as context (as opposed to what was printed on my screen). What the transcripts showed is a bunch of truncated or missing answers. For example, some of my messages (which at the time were interleaved with ChatGPT’s responses) are now clumped together, with entire pieces of the conversation on the GPT’s side missing.
I paste an example below, from a Custom GPT I am working on (an interactive fiction game).
If you look at this screenshot of the transcript, it looks like I interrupted the GPT while it was listing a bunch of scenarios and picked the “French resistance” setting, which does not appear in the GPT’s answer.
However, that’s not what happened on my screen at the time of the interaction (unfortunately, I have no screenshot of that). The GPT’s answer was not truncated at the time of execution: it listed six scenarios (as it should), one of which was the French resistance setting, which is the one I ended up picking.
Also, an entire GPT message is missing from the transcript: the message in which the GPT presented me with multiple characters from the French resistance setting, to which I answered “Lucien”. Notice how my two separate responses are now clumped into a single message which says:
In hindsight, this is what the GPT saw for the purpose of continuing the conversation, as opposed to the full answers which were shown to me.
The subsequent conversation has the GPT looking dumb, asking me about the setting or Lucien multiple times, and trying to continue as well as it could despite having no idea what I was talking about. The point is, ChatGPT is very good at “winging it”, so it was not immediately clear that it was missing entire bits of the conversation; the responses just looked slightly odd and incoherent. In short, we had a confused GPT and a very frustrated user (myself).
With the evidence above, I believe the reason for the GPT’s behavior in this situation is clear. It’s not really a problem of the model being nerfed or whatever, but just some “trivial” database or network issue, and obviously a bug.
PS: The “3/3” you see in the screenshot above is there because I went back and retried multiple times, given the incoherent responses. All threads show the same issue of truncated/missing messages, which I didn’t know at the time, as on my screen I could see fully-formed (if increasingly confused) answers.
If you feel that ChatGPT is behaving particularly weirdly and inconsistently, reload the current conversation and check the transcript. You may realize that ChatGPT was missing some crucial messages. At that point, your best bet is to start a new conversation (although if the network/database failures are still ongoing, you might experience the same issue again soon).