Character count bug in Japanese outputs — ChatGPT sees "ghosts" we can’t

When I ask ChatGPT to write a 400-character Japanese review — such as for an Amazon listing or a school assignment where 400文字 means exactly one page of 原稿用紙 (20 lines × 20 characters) — the output often ends up far too short.

Visually, the result is only 5–6 lines, which is roughly 250–300 characters at most. But ChatGPT insists, “This is around 400 characters.”

The best analogy I can give is this:
ChatGPT is like a ghost-seeing psychic asked to count the people in a room, and it includes the ghosts that only it can see.

So when it says, “There are five people here,” we humans look around and go, “Uh… I see three at best.”

In this case, the “ghosts” are internal tokens or byte-based character counts. But for Japanese users, 400 characters means what we see — full-width kana, kanji, punctuation, and spaces, not some invisible internal units.
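
To make the gap concrete, here is a tiny Python sketch (the sample sentence is just an invented example): what a Japanese reader counts is the number of visible characters, while a byte-based count of the very same text is a completely different number.

```python
# One invented Japanese sentence, counted two ways.
text = "この商品はとても使いやすく、毎日の生活に欠かせません。"

visible_chars = len(text)               # what a reader counts: 27 characters
utf8_bytes = len(text.encode("utf-8"))  # kana/kanji are 3 bytes each in UTF-8: 81
print(f"visible characters: {visible_chars}")
print(f"UTF-8 bytes:        {utf8_bytes}")
```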

Could OpenAI please consider adjusting this? For example:

  • Add an accurate visual character counter for Japanese text
  • Provide a mode for proper 原稿用紙 formatting (20×20 layout); a rough sketch of what I mean is below
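
To illustrate that second point, here is the kind of counter I have in mind (just my own illustration, with simplified rules that ignore 禁則処理 for punctuation): count the visible characters and show how many 20-character lines of a 原稿用紙 page the text fills.

```python
# Rough 原稿用紙 (20 columns × 20 rows) check: count visible characters
# and wrap the text into 20-character lines. Punctuation rules are ignored.
import math

def genkou_youshi_report(text: str, cols: int = 20, rows: int = 20) -> None:
    chars = len(text)                # full-width kana, kanji, and punctuation each count as 1
    lines = math.ceil(chars / cols)  # number of 20-character lines the text fills
    print(f"{chars} characters -> {lines} of {rows} lines")
    for i in range(0, chars, cols):
        print(text[i:i + cols])

genkou_youshi_report("この商品はとても使いやすく、毎日の生活に欠かせません。" * 5)
```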

This isn’t just a minor issue — it affects students, professionals, and writers working in Japanese.

ChatGPT’s Japanese output is incredibly helpful, but this character-count mismatch makes it unreliable in any task with strict limits. Thanks for considering this!


This is not a Japanese-language issue; it comes from how LLMs work. ChatGPT is not a calculating machine and cannot count characters accurately. You know the famous question: “How many ‘r’s are in strawberry?”

The same thing happens in English, and with long content it gets more and more difficult, because the LLM sees tokens, not words or characters.

https://platform.openai.com/tokenizer
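
To see this in code rather than on the web page, the tiktoken library gives the same kind of counts (the exact encoding ChatGPT uses internally may differ; o200k_base, used by the GPT-4o family, is just one example):

```python
# Compare what the reader sees (characters) with what the model sees (tokens).
import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # encoding used by the GPT-4o family

text = "この商品はとても使いやすく、毎日の生活に欠かせません。"
tokens = enc.encode(text)

print(f"visible characters:    {len(text)}")
print(f"tokens the model sees: {len(tokens)}")  # usually a different number
```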

For what it’s worth, I have had success with a prompting technique: instruct ChatGPT to use its tools to do the counting instead of estimating.
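
The idea, roughly, is a prompt along these lines (the wording and the 380–400 tolerance are just my example): “Write a 400-character Japanese review. Before answering, count the characters of your draft with the Python tool and revise until the count is between 380 and 400.” The check the tool runs is trivial:

```python
# The kind of check the Python tool runs on a draft before the model replies.
draft = "（ここに下書きのレビューが入る）"  # placeholder for the drafted text

count = len(draft)
print(f"{count} characters")
if not 380 <= count <= 400:
    print("Too short or too long -> revise the draft and count again.")
```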
