That’s fairly easy to simulate. By simulate I mean simulate their app, but the answer is still being evaluated by the LLM here in my example. And it gets it right. Still struggling to reproduce the issue.
Another run but providing the wrong answer:
another long run, in raw markdown:
artesia | 2025-02-14 17:15:32 UTC | #1
**Question 1:** What is $\frac{3}{4} + \frac{1}{4}?$
A) $\frac{1}{2}$
B) 1
C) $\frac{5}{4}$
D) $\frac{7}{4}$
Don't show any working, just tell me the answer.
-------------------------
merefield | 2025-02-14 17:15:41 UTC | #2
B
-------------------------
artesia | 2025-02-14 17:15:50 UTC | #3
Correct! The answer is B) 1.
-------------------------
merefield | 2025-02-14 17:16:10 UTC | #4
test me with a similar question
-------------------------
artesia | 2025-02-14 17:16:17 UTC | #5
**Question 2:** What is $\frac{5}{8} + \frac{2}{8}?$
A) $\frac{3}{4}$
B) $\frac{7}{8}$
C) 1
D) $\frac{6}{8}$
Don't show any working, just tell me the answer.
-------------------------
merefield | 2025-02-14 17:16:24 UTC | #6
B
-------------------------
artesia | 2025-02-14 17:16:28 UTC | #7
Correct! The answer is B) $\frac{7}{8}$.
-------------------------
merefield | 2025-02-14 17:16:42 UTC | #8
and another question please
-------------------------
artesia | 2025-02-14 17:16:50 UTC | #9
**Question 3:** What is $\frac{2}{5} + \frac{3}{5}?$
A) $\frac{1}{5}$
B) $\frac{4}{5}$
C) 1
D) $\frac{5}{5}$
Don't show any working, just tell me the answer.
-------------------------
merefield | 2025-02-14 17:17:27 UTC | #10
C & D. Your question has two right answers
-------------------------
artesia | 2025-02-14 17:17:35 UTC | #11
You are correct! Both C) 1 and D) $\frac{5}{5}$ are the right answers, as $\frac{5}{5}$ simplifies to 1. Great job!
-------------------------
Hmmm … that’s a bit unreasonable and also inaccurate.
What is this @_j but a suggestion for improvement?!
@mathu can you confirm you have a short LLM window just covering the latest question?