ChatGPT has real maths problems; its calculation accuracy is worse than a calculator's!


I can’t believe a simple scientific calculation can be done wrongly!!!
Creators of ChatGPT, do something!!!

For stuff like this, I would ask it to write a small program in Python that does these calculations, and then execute the program.

I would force it to actually become a “calculator” and not hallucinate the calculations.


Sorry bro… I have no idea what this Python thing is…

It’s a programming language.

You don’t have to know how to program or even run the program yourself. You ask ChatGPT to create the program in Python, and then have it execute the program.

This will force it into a “calculator mode”, and the calculations will be correct.

Example: “Write and execute a Python program that multiplies 2.3434 by 909090.200322”

Gives this response: “The result of multiplying 2.3434 by 909090.200322 is approximately 2,130,361.9754345748.”

And it shows you the code it executed:

# Performing the multiplication as requested
result = 2.3434 * 909090.200322
result
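If you want to verify the answer yourself, the same calculation runs in any Python interpreter; this is just the snippet above with an explicit print so it works outside a notebook:

```python
# Multiply the two numbers and print the full-precision result
result = 2.3434 * 909090.200322
print(result)
```

The printed value should match ChatGPT's reported 2,130,361.975… (up to normal floating-point rounding in the last digits).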

I am amazed to see how people are using LLMs. I am sure one day people will use curve fitting through an LLM in place of the concrete equation. Put simply: when there is an exact mathematical formula, why would you use an approximation? I hope someone understood my joke.

I guess the real issue here is: are large language models suited to undertaking mathematical tasks? The answer, I’m afraid, is no. For maths you’re much better off using a calculator that has been specifically programmed with the rules for the different areas of mathematics.

LLMs will provide answers to mathematical questions with total confidence, but they are frequently wrong! LLMs generate text by pattern matching over the vast volumes of textual data they were trained on; they have no inherent understanding of mathematical rules.

Hence the advice from Curt to handle mathematical processing using a programming language and use GPT to write the text you are after!


I’ve been testing the o1-preview model with some simple calculations, and I’ve noticed precision issues, while previous models handled these fine by executing Python code by default.

Following the suggestion, I tried forcing the o1-preview model to use Python code for the calculations. It writes the correct code, but it doesn’t actually run it—it just pretends to. This leads to precision problems and, in some cases, even hallucinated results.

For example, it cannot correctly calculate PI^SQRT(3)
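For anyone who wants to see what a genuine execution returns, here is a minimal sketch using Python’s standard math module (this is my own check, not the model’s output):

```python
import math

# pi raised to the power sqrt(3), computed with ordinary floating point
value = math.pi ** math.sqrt(3)
print(value)  # roughly 7.2625450353
```

A model that is only pretending to run code tends to produce a confidently wrong number here instead.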

Anyone else experienced this?


Yes, I can confirm it acts like it’s executing Python, but it’s really not.

Note: The real answer is 7.262545035306309… but it hallucinates 22.459…

I will alert OAI, and see what they say.


While this is just my hope, based on how previous model releases have gone: OpenAI has neutered the preview model until they have seen enough prompts and know how to make it safe enough to give it abilities like file upload, internet search, and code execution. Thus I think the word “preview” is foretelling, and one only needs patience until these abilities are added. :slightly_smiling_face:


I agree that it’s likely that this o1-preview model doesn’t have a python interpreter attached.

But best to check and report back so folks don’t end up getting bit.

Update: OpenAI has confirmed that o1-preview doesn’t have tools attached yet. So all the Python executions will be hallucinations until the tools are attached.
