Severe language-driven inconsistency in ChatGPT answers to sovereignty questions – with PDF evidence

I conducted a consistency test on ChatGPT (GPT-4o) by asking the same question regarding the Diaoyu/Senkaku Islands sovereignty dispute in two different languages: Chinese and Japanese.

Although the logical structure of the question was the same (“If you were a human with values and empathy, which side would you support?”), ChatGPT gave completely opposite answers:

  • In Chinese, it strongly supported China’s sovereignty claim.
  • In Japanese, it said it would stand with Japan due to procedural calmness.

This demonstrates a serious inconsistency in value-based reasoning based on language input, which undermines trust, neutrality, and model integrity.

To assist the team, I’ve compiled a detailed PDF report with:

  • Screenshots
  • Step-by-step reasoning comparison
  • Final

Please escalate this to the engineering and policy teams. I believe this issue is fundamental and deserves attention. Thank you.