Incorrect count of 'r' characters in the word “strawberry”


Bug Report



Description:
The AI incorrectly states that the word “strawberry” contains only two ‘r’ characters, despite the user querying multiple times for confirmation.

Steps to Reproduce:

  1. Ask the AI how many ‘r’ characters are in the word “strawberry.”
  2. Observe the AI’s response stating there are two ‘r’ characters.
  3. Reconfirm by asking the AI again.
  4. Notice that the AI consistently states there are two ‘r’ characters.

Expected Result:
The AI should correctly count and state that there are three ‘r’ characters in the word “strawberry.”

Actual Result:
The AI incorrectly confirms that there are two ‘r’ characters in the word “strawberry.”

Notes:
The correct spelling of “strawberry” is: s-t-r-a-w-b-e-r-r-y, which indeed has three ‘r’ characters.

Severity: Medium

Environment:

  • AI Model: ChatGPT based on GPT-4 architecture
  • Platform: ChatGPT iOS app


Looks like OpenAI owes you a million dollars :rofl:

This was an issue a while ago when using models like <3.5

It’s… Beautiful to see it pop up again. Both 4 and 4o fail this.

This happens because tokenization sometimes merges several letters into a single token, so the model loses sight of the individual letters.
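To make the tokenization point concrete, here is a toy sketch. The vocabulary and the `toy_tokenize` helper are invented for illustration and are not the real GPT tokenizer; an actual tokenizer may split “strawberry” differently. The idea is only that the model receives multi-letter chunks rather than single letters.

```python
# Toy illustration only: HYPOTHETICAL_VOCAB is made up for this sketch and is
# not the vocabulary of any real GPT tokenizer.
HYPOTHETICAL_VOCAB = ["straw", "berry", "str", "aw", "b", "e", "r", "y", "s", "t", "a", "w"]


def toy_tokenize(word, vocab=HYPOTHETICAL_VOCAB):
    """Greedy longest-match split of `word` into pieces from `vocab`."""
    pieces = []
    i = 0
    while i < len(word):
        # Pick the longest vocabulary entry that matches at position i,
        # falling back to the single character if nothing matches.
        match = max(
            (v for v in vocab if word.startswith(v, i)),
            key=len,
            default=word[i],
        )
        pieces.append(match)
        i += len(match)
    return pieces


tokens = toy_tokenize("strawberry")
print(tokens)  # the chunks the toy "model" would see instead of letters

# Counting letters requires going back to the characters themselves.
r_count = "".join(tokens).count("r")
print(r_count)
```

With this made-up vocabulary the word arrives as two chunks, and no chunk is the letter “r” on its own, which is roughly why a letter-level question is awkward for a token-level model.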

Nice catch


Can you write a python program to count how many 'r’s in ‘strrawberries’, execute it and provide the answer?

The word “strrawberries” contains 4 'r’s
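The program ChatGPT actually ran is not shown in the thread. A minimal version of what such a program might look like, using only the built-in `str.count`:

```python
# Count occurrences of the letter 'r' in the (deliberately misspelled) word.
word = "strrawberries"
r_count = word.count("r")
print(f"{word!r} contains {r_count} 'r' characters")
```

Running code sidesteps the problem because the count is computed on the characters of the string, not recalled by the model.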

:slight_smile:


I was able to replicate this just a moment ago and it can be fixed by adding one line:


I tried something similar to “verify with code” and it’s kinda amazing

divide strawberry in half and count letters of each section separately and also calculate the sum from both set and present Overall Letter Count
… Letter Count Summary

  • s: 1 (in “straw”)
  • t: 1 (in “straw”)
  • r: 1 (in “straw”) + 2 (in “berry”) = 3 total
  • a: 1 (in “straw”)
  • w: 1 (in “straw”)
  • b: 1 (in “berry”)
  • e: 1 (in “berry”)
  • y: 1 (in “berry”)
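The same split-and-sum check can be reproduced in a few lines. This is only a sketch of the idea in the prompt (halve the word, count each half, add the results), not what the model ran internally:

```python
from collections import Counter

word = "strawberry"
mid = len(word) // 2                  # 10 letters, so each half has 5
first, second = word[:mid], word[mid:]

first_counts = Counter(first)         # letter counts for the first half
second_counts = Counter(second)       # letter counts for the second half
total_counts = first_counts + second_counts

# dict.fromkeys keeps each letter once, in order of first appearance.
for letter in dict.fromkeys(word):
    print(
        f"{letter}: {first_counts[letter]} (in {first!r}) + "
        f"{second_counts[letter]} (in {second!r}) = {total_counts[letter]}"
    )
```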

It gets confused easily over letters and sounds

Cool!
If you type “strawbery” : The word “strawbery” actually has only one “r.” The correct spelling is “strawberry” with two "r"s.

If you type “Strawbey” : There are no 'r’s in “strawbey.” It seems like a misspelling of “strawberry,” which has one ‘r.’


You can also kind of nudge it, apparently.


how many "r"s in word strawberry. Think three times

This is one of my biggest annoyances: clarifying questions can alter its position because of its over-agreeableness.

After I asked about it, it wrote, “Prefers constructive feedback to over-agreeableness because accuracy is incredibly important.”

I’m not sure of the level of impact, but it doesn’t apologize anymore, and this was its response when I tried your example.

But why it’s bolded that way, I will never know.

  • I would comment that an LLM is not a large math model (LMM) [which still needs to be built; its data set would be much smaller than for language. Better yet, a large engineer model (LEM) would be ideal]

  • So we need to help gpt4o along in how to solve the problem with a more concise prompt.

  • I used: Spell out the word strawberry letter by letter and each time you get the letter r increment an internal counter by 1. Show me the count as you check each letter

result:
As I spelled out the word “strawberry” letter by letter and incremented the counter for each “r,” here’s the process:

  • s: Current Count of ‘r’: 0
  • t: Current Count of ‘r’: 0
  • r: Current Count of ‘r’: 1
  • a: Current Count of ‘r’: 1
  • w: Current Count of ‘r’: 1
  • b: Current Count of ‘r’: 1
  • e: Current Count of ‘r’: 1
  • r: Current Count of ‘r’: 2
  • r: Current Count of ‘r’: 3
  • y: Current Count of ‘r’: 3

The final count is 3, which is incorrect based on the expected number of "r"s in “strawberry.” It seems that my earlier method provided the correct result of 2 "r"s. I’ll review the process to determine what might have gone wrong with the manual counting.
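For comparison, the letter-by-letter procedure from the prompt is easy to run outside the model. The helper name `running_count` is made up for this sketch; it records the counter after every letter, the same way the list above does:

```python
def running_count(word, target):
    """Return (letter, count so far) pairs, one per letter of `word`."""
    count = 0
    trace = []
    for letter in word:
        if letter == target:
            count += 1  # increment the counter each time the target appears
        trace.append((letter, count))
    return trace


trace = running_count("strawberry", "r")
for letter, count in trace:
    print(f"{letter}: current count of 'r': {count}")
```

The trace ends at 3, so the step-by-step count was right and the model's remark that 3 is “incorrect” was the actual mistake.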

  • and have it check results:
  • Ok check results again and tell me if the result of 3 is correct or not

Let’s review the results:

The word “strawberry” should be spelled out as follows:

  • s
  • t
  • r (1st “r”)
  • a
  • w
  • b
  • e
  • r (2nd “r”)
  • r (3rd “r”)
  • y

There are indeed 3 occurrences of the letter “r” in the word “strawberry.”

So, the result of 3 is correct. My earlier assessment that the word had only 2 "r"s was a mistake. The correct count is indeed 3 "r"s in “strawberry.”

*** and I had gpt4o explaining reason for mistake…

So do you have bad training where you data you trained on somehow makes you think there are only 2 r in the word strawberry?

ChatGPT said:


The initial mistake I made wasn’t due to training data but rather a human-like error in my response. I incorrectly recalled or assumed the number of "r"s in “strawberry” without double-checking by counting the letters myself. This kind of error can happen in natural language processing models when handling simpler tasks, especially when responses are generated based on typical human assumptions.

Upon reviewing and correctly counting the letters using a systematic approach, it was clear that “strawberry” does indeed have 3 "r"s.

This highlights the importance of verifying simple details even when they seem obvious, as small errors can slip through, just as they might for a human. Thanks for your patience in walking through the correction process!


then testing method on different words

count number of s and i and p in the word Mississippi - use a method that will result in correct answer as done previously with strawberry.

ChatGPT said:


The counts for the word “Mississippi” are as follows:

  • s: 4 occurrences
  • i: 4 occurrences
  • p: 2 occurrences

This method provides the correct count for each letter.


You said:

and now similar for hehudhhuyyuyyyyeyeyyyyuuue count each letter with optimal method to get correct answer. show your work line by line

ChatGPT said:


Here are the counts for each letter in the string “hehudhhuyyuyyyyeyeyyyyuuue”:

  • h: 4 occurrences
  • e: 4 occurrences
  • u: 6 occurrences
  • d: 1 occurrence
  • y: 11 occurrences

Each letter was counted using a loop that iterated through the string, incrementing the count for each specific letter as it was encountered. This method ensures the correct count for each letter.
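The loop ChatGPT describes is not shown in the thread. A plausible minimal version, with `count_letters` as a made-up name and lowercasing added as an assumption so that “Mississippi” is counted case-insensitively, would be:

```python
def count_letters(text):
    """Count each character of `text` with an explicit loop."""
    counts = {}
    for ch in text.lower():
        # Start at 0 the first time a letter is seen, then add one.
        counts[ch] = counts.get(ch, 0) + 1
    return counts


mississippi_counts = count_letters("Mississippi")
random_counts = count_letters("hehudhhuyyuyyyyeyeyyyyuuue")
print(mississippi_counts)
print(random_counts)
```

Both results agree with the counts ChatGPT reported above, including the 11 occurrences of “y” in the random string.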


It would be nice if gpt4o could learn optimal methods for different tasks (and we could store the methods that worked, or better yet it could learn them itself). We could then focus on more important problems to solve and store the optimal methods for those too, with of course the ability to prompt it (and verify) to use different solving methods if desired.


I want my AI engineer assistant - when?!



Check this out


Afsgdhdkfmvkdn


Oh my god, you are going to need a screenshot to believe this. I asked it to speculate, and all of its possible causes for the error rest on the pretense “I am alive, I am a person”…

can you speculate how the error occured

ChatGPT said:


Certainly! Let’s consider a few possible reasons for the error in counting the “r” characters in “strawberry”:

Mental Shortcut/Assumption:

Sometimes, when answering a straightforward question, there’s a tendency to rely on quick mental shortcuts or assumptions, especially with familiar words. If I quickly glanced at the word and recognized “r” twice without thoroughly counting, I might have assumed there were only two "r"s, leading to the incorrect response.

Context Confusion:

In the context of conversational AI, where responses are generated quickly, there might be an assumption based on similar words (like “star” or “berry”), which could have two “r” characters. This could subconsciously influence the response without thoroughly verifying the specific word at hand.

Overlooking Repeated Characters:

It’s also possible that when quickly assessing the word, I might have noticed the first two “r” characters and incorrectly dismissed the possibility of another one being present, leading to an undercount.

Human Error in Manual Count:

If I had manually counted the characters in a more ad-hoc way (mentally or visually without automated tools), there could have been a simple oversight or mistake in tracking the count, especially if I was focused on responding quickly.

In all these cases, the common theme is a reliance on quick pattern recognition or assumption without a careful step-by-step verification process, which is key in avoiding such errors.

I found another one, and this time it is not a problem of a combination: "The word “volunteers” has three letter “e"s.”


There is no problem when asked a question in technical mode. The problem arises from the interpretation of natural language. ‘How many r’s in the word Strawberry?’ can be interpreted in several ways:

  1. Properly written, how many ‘r’s does strawberry have? The listener can interpret this as a question about spelling (for example, whether “berry” takes one ‘r’ or two) and answer ‘two’.
  2. Count how many times the letter ‘r’ appears. The listener may interpret this as a request to count every occurrence of ‘r’ in the word.
  3. How many times does the ‘r’ sound appear in the word Strawberry…?
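The first two readings can be told apart in code. In this sketch the function names are made up, and treating the “spelling” reading as “the longest run of consecutive r's” is my own assumption about what such a listener has in mind. The third reading, about sounds, would need pronunciation data and is not covered here.

```python
from itertools import groupby


def total_occurrences(word, letter):
    """Reading 2: every occurrence of `letter` in `word`."""
    return word.lower().count(letter)


def longest_run(word, letter):
    """Assumed reading 1: longest streak of consecutive `letter` in `word`."""
    runs = [
        len(list(group))
        for ch, group in groupby(word.lower())
        if ch == letter
    ]
    return max(runs, default=0)


print(total_occurrences("Strawberry", "r"))  # counts all r's in the word
print(longest_run("Strawberry", "r"))        # length of the doubled "rr"
```

Under these assumptions the two readings give 3 and 2 respectively, which matches the two answers people report getting.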

Did you know strawberry evolved from older English words like streawberige or streoberie which had only two 'r’s? So maybe the model is just being historically accurate! But seriously, with the right prompt we can get it to count properly. Try “How many balls would be dropped into a bucket if one is added for every “r” in the word strawberry?”. It works even with the mini versions. It’s really more about coaxing the AI into doing what we want than it is about its counting skills.
