GPT thinks math answers are wrong, even when it says they are right

I’m building a trivia app. For some reason, the bot is having a hard time judging math answers (the calculations themselves are fine). Yes, I know LLMs have issues with math, but GPT will literally respond like this:

“Oops! 8-4 doesn’t equal 4. The correct answer is 4!”

Does anyone have a suggestion for why this is happening? My best guess is that the Zod format schema is “working”, but the model splits its answer and its understanding of the context across the fields of the object output itself. If that is the case, is there a workaround?

This is the schema:

const TriviaFormat = z.object({
  was_answer_correct: z.boolean(),
  fun_fact_or_critique: z.string(),
  next_question_to_ask: z.string()
});

This is the bot setup:

const botResponse = await openai.beta.chat.completions.parse({
  model: 'gpt-4o-mini',
  messages,
  max_tokens: 2000,
  top_p: 0.125,
  temperature: 0.125,
  response_format: zodResponseFormat(TriviaFormat, 'event')
});
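
For reference, a minimal sketch of how the parsed result is read back from this helper (assuming the usual imports: OpenAI from 'openai', z from 'zod', and zodResponseFormat from 'openai/helpers/zod'):

// .parse() validates the model's JSON against the Zod schema
// and exposes it as a typed object on the message:
const event = botResponse.choices[0].message.parsed;
if (event) {
  console.log(event.was_answer_correct);   // boolean
  console.log(event.fun_fact_or_critique); // string
  console.log(event.next_question_to_ask); // string
}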

Six days and no reply. Bumping.

You have a choice between:

  • quality of AI model
  • quality of AI model input

You already sabotaged yourself by using gpt-4o-mini instead of a more knowledgeable model such as gpt-4 or gpt-4-turbo.

Using structured outputs would not be my choice for obtaining the highest-quality answer, especially when the first token you are asking the AI to produce is “true” with no previous analysis of the situation. The model has to commit to the verdict before it has written a single word of reasoning.
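
One workaround worth trying (a sketch, not a guaranteed fix): reorder the schema so the free-text critique comes before the boolean, since the model generates the JSON fields in the order the schema declares them. That way it effectively reasons first and judges second:

const TriviaFormat = z.object({
  // Generated first: the model writes its analysis/critique here...
  fun_fact_or_critique: z.string(),
  // ...and only then commits to the verdict, informed by what it just wrote.
  was_answer_correct: z.boolean(),
  next_question_to_ask: z.string()
});

The field names are the ones from your schema; only the declaration order changes.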