I encountered a rare but noticeable bug in ChatGPT where a random Chinese character appeared in the middle of a generated Russian word.
Bug Example (Original Output):
“Один раз 撒авши, кто тебе поверит?”
Expected Output:
“Один раз солгавши, кто тебе поверит?”
Context:
This happened when ChatGPT attempted to generate a well-known Russian quote: “Единожды солгавши, кто тебе поверит?” Instead of the correct word “солгавши”, the output contained the Chinese character 撒 in place of the stem “солг”, followed by the correct ending “авши”.
Possible Cause:
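It might be related to how the model processes partially recognized words or substitutions when reconstructing quotes. The error suggests a problem at the token level: the stem of the word was replaced by a single wrong character, while the ending “авши” was generated correctly.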
Steps to Reproduce:
- Ask ChatGPT to generate the Russian phrase “Единожды солгавши, кто тебе поверит?”
- Observe if any unexpected characters appear in the output.
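To make the second step less dependent on spotting the character by eye, the check can be scripted. The following is only a minimal sketch: the helper name find_foreign_chars and the idea of comparing Unicode character names against an expected script prefix are my own assumptions, not anything ChatGPT or OpenAI provides. It uses only Python's standard unicodedata module.

```python
import unicodedata


def find_foreign_chars(text, expected_script="CYRILLIC"):
    """Return (index, char, unicode_name) for every letter whose Unicode
    name does not start with the expected script prefix.

    Hypothetical helper for illustration; digits, spaces and punctuation
    are ignored because str.isalpha() is False for them.
    """
    hits = []
    for index, char in enumerate(text):
        if not char.isalpha():
            continue
        name = unicodedata.name(char, "UNKNOWN")
        if not name.startswith(expected_script):
            hits.append((index, char, name))
    return hits


# The two strings from this report, without the surrounding quotation marks.
reported = "Один раз 撒авши, кто тебе поверит?"
expected = "Один раз солгавши, кто тебе поверит?"

print(find_foreign_chars(reported))  # flags the single character 撒
print(find_foreign_chars(expected))  # nothing flagged
```

Pasting a model response into reported shows whether any non-Cyrillic letter slipped into the text. Note that this simple check would also flag legitimate Latin words in a mixed-language answer, so it only suits output that is expected to be purely Russian.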
This bug may be rare, but it affects text integrity and could confuse users. I hope this helps improve the model.
Thank you!