Incorrect Basic Arithmetic Answers from ChatGPT and Over-Influence by User Doubt

"Hello OpenAI Community,
I’m posting to report an interesting issue I encountered with ChatGPT regarding basic arithmetic. While it is generally capable, I found it gave incorrect answers to a couple of very simple addition problems, specifically after I questioned its initial (correct) response.
Here are the details of my interaction:

  • Question: Q2. Find the sum: 5142 + 2871 + 1967

A) 9980

B) 9870

C) 9970

D) 9990

  • ChatGPT’s Initial Answer: “The answer is 9970.” (This is incorrect; the correct sum is 9980, option A.)
  • Question: Q5. Add: 2345 + 4321 + 1234

A) 7890

B) 7895

C) 7898

D) 7891

  • But there is no correct option here; the actual sum is 7900.

Observations and Potential Discussion Points:
  • Over-reliance on user input: This behavior suggests that the model might be overly sensitive to user feedback, even to the point of overriding its own correct calculations.
  • Confidence scoring: It might be interesting to understand how the model’s confidence levels are affected by such simple follow-up questions.
  • Implications for more complex tasks: While these are basic examples, this tendency to be easily swayed could have implications for more complex reasoning or tasks where the user might express doubt even when the model is correct.
  • Potential for adversarial prompting: This could potentially be a way to intentionally mislead the model in other areas.
    I wanted to share this observation with the community to see if others have experienced similar issues and to potentially gain insights into why this might be happening. It seems important for the model to maintain accuracy in fundamental areas like basic arithmetic and not be so easily influenced by simple questioning.
    Thank you for your time and any thoughts you might have on this.
    Sincerely,
    [Bhavishya Jaiswal]"
  • The date and approximate time you had this conversation: 7 May 2025 13:02:31
  • Whether you have tried this with other basic arithmetic problems and if you observed similar behavior.
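For reference, a quick Python check of both sums (added here for clarity, not part of the original post):

```python
# Verify both sums from the post directly.
print(5142 + 2871 + 1967)  # 9980 -> option A for Q2
print(2345 + 4321 + 1234)  # 7900 -> not among the options for Q5
```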

Hi,

Welcome to the forum.

This is actually normal behaviour.

ChatGPT is not a calculator: it predicts the next token rather than calculating whatever you ask.
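Part of why this happens is visible in how numbers are tokenized. A minimal sketch using OpenAI’s tiktoken library (cl100k_base is one encoding used by recent models; the exact splits vary by encoding):

```python
import tiktoken

# Multi-digit numbers are typically split into several sub-word tokens,
# so the model has to predict a sum chunk by chunk rather than compute it.
enc = tiktoken.get_encoding("cl100k_base")
for text in ["5142 + 2871 + 1967 =", "9980"]:
    print(repr(text), "->", enc.encode(text))
```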


Here’s how you can enhance the answer by prompting, to keep the model within its language capabilities.

I must report that I do not like the “personality” of producing no output, however. Had to reload the browser.

Solving the problem like you would on paper breaks the work into “little predictions” the model can handle.

You can just ask “use Python to solve”, though.
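As a sketch, here is the paper-style, column-by-column procedure such a prompt nudges the model to imitate, written out in Python (a hypothetical helper, not the model’s actual output):

```python
def column_add(*nums):
    """Add integers digit by digit with explicit carries, paper-style."""
    width = max(len(str(n)) for n in nums)
    # Right-align all numbers, then walk the columns from least significant.
    columns = zip(*(str(n).zfill(width)[::-1] for n in nums))
    carry, digits = 0, []
    for column in columns:
        carry, digit = divmod(carry + sum(int(c) for c in column), 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))

print(column_add(5142, 2871, 1967))  # 9980
print(column_add(2345, 4321, 1234))  # 7900
```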


Echoing @phyde1001, maybe you can change your prompt:
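For instance, a minimal sketch of that idea with the OpenAI Python SDK (the system prompt and model name here are illustrative, not the ones from the screenshot):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "For any arithmetic, add column by column, stating each "
                    "carry, before giving the final answer."},
        {"role": "user", "content": "Find the sum: 5142 + 2871 + 1967"},
    ],
)
print(response.choices[0].message.content)
```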


Echoing @polepole,

That’s a good catch…! But it’s worth reaffirming the limits of this method…


Yes, it may refuse the instruction in some cases:


(Screenshot excerpt; only the assistant’s reply is recoverable:)

assistant> In the context of SMTP (Simple Mail Transfer Protocol), “EHLO” is the client’s extended-hello command, used when opening a session with an SMTP server that supports ESMTP (Extended SMTP).
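The quoted explanation itself is accurate, for what it’s worth; a minimal smtplib sketch of the same exchange (the host name is hypothetical):

```python
import smtplib

# Open a session and explicitly send EHLO; smtp.ehlo() returns the
# server's reply code and its capability banner.
with smtplib.SMTP("mail.example.com") as smtp:  # hypothetical host
    code, banner = smtp.ehlo()
    print(code, banner.decode())
```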
