GPT-4 cannot count and is not able to check its own results

GPT-4 cannot count and is not able to check its own results.
This raises the question of how GPT can produce more complex program code.

I tried to write a set of instructions for generating a symmetrical poem: a 4x4 table with exactly 4 words in each cell.
Every time, over many attempts, GPT put 5 words in each cell, even after its supposed self-check…
I now understand where the frustrating trial-and-error cycles sometimes come from when using GPT for coding.

(Originally written in German, but I think you can count the words.)

  • Zelle 1: “Herz brennt still im Schmerz” → 4 Wörter (Korrekt)
  • Zelle 2: “Seele sucht Trost im Raum” → 4 Wörter (Korrekt)
  • Zelle 3: “Liebe wächst tief im Traum” → 4 Wörter (Korrekt)
  • Zelle 4: “Träne fällt sanft auf Grund” → 4 Wörter (Korrekt)
  • Zelle 5: “Seele weint leis in Nacht” → 4 Wörter (Korrekt)
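
For anyone who wants to verify the count mechanically, here is a minimal Python sketch. It assumes that a "word" is simply a whitespace-separated token; the `cells` list and the `word_count` helper are names made up for this illustration, and the strings are copied from the self-check above.

```python
# Cells copied from GPT's self-check above; each one was labelled "4 Wörter (Korrekt)".
cells = [
    "Herz brennt still im Schmerz",
    "Seele sucht Trost im Raum",
    "Liebe wächst tief im Traum",
    "Träne fällt sanft auf Grund",
    "Seele weint leis in Nacht",
]


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a 'word')."""
    return len(text.split())


counts = [word_count(cell) for cell in cells]
for number, (cell, count) in enumerate(zip(cells, counts), start=1):
    print(f"Cell {number}: {count} words -> {cell}")
```

Under that definition, every one of these cells comes out at 5 words, not 4.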

If we ask directly, it makes mistakes.
But with a well-crafted prompt, it works better.

Here is the constructor, if you are interested:
I always get 5 words per cell, and GPT even got its own self-check wrong.
It feels like a 5-year-old lying to stay out of trouble. :smile:

**Compose a Table Poem:**

* **Theme of the Poem**: Love and Heartache.
* **Structure**: A poem in a table with 4 rows and 4 columns.
* **Word Count**: Exactly 4 words must be used in each cell/sentence.
  * For 5 words per cell/sentence: {First_Noun Action Adjective Conjunction Last_Noun} (e.g. Heart burns softly through stillness)
  * For 4 words per cell/sentence: {First_Noun Adjective Conjunction Last_Noun} (e.g. Heart soft and still)
  * For 3 words per cell/sentence: {First_Noun Adjective Last_Noun} (e.g. Heart weeps still)
  * For 2 words per cell/sentence: {Noun Adjective} (e.g. Heart alone)
* **A Cell or Sentence**: A sentence is the text content of a cell, with the given number of words per cell/sentence.
* **Meaningful Sentences**: Each cell must contain a meaningful sentence related to the theme of the **Theme of the Poem**.
* **Unique Texts**: Each row and each column must contain unique text, and repetitions should be avoided.
* **Nouns**: The first and last word in each cell must be a singular noun (e.g., "Heart", "Pain"). Each noun should, if possible, only be used once.
* **Connecting Words**: Fill in the connecting words in each cell to create a poetic and meaningful sentence.
* **Rhyme Structure**: If possible, try to ensure that the last words of the rows and/or columns rhyme to support the rhythmic flow. Place rhyming final nouns in the same row or column.
* **Poetic Language**: Use poetic and descriptive language to make the theme emotional and graceful. For example, formulations like "Erde" to "Erd" or "Erdenrund" (German) are allowed, especially if they help to create rhymes.
* **Procedure**:
  1. For the poem, you need to find a total of x*y*2 nouns. The nouns should be unique throughout the text, meaning they should not be used as either the first or the last noun more than once.
  2. First, create a list of unique nouns that match the theme and should be placed at the end of each sentence. In total, there should be x*y unique words.
  3. Poetically modify these nouns so that they rhyme as much as possible.
  4. Then sort these words so that they rhyme within a row or column, or both (e.g., Heart and Pain in the same row or column, often staggered in poems).
  5. Then add the appropriate rest to complete the sentence. This means the appropriate unique first noun in a sentence, and the appropriate connecting words. The first noun should also not be used as the last noun, so there will be x*y unique words (in total now x*y*2 unique nouns).
* **Tab-separated**: The cells must be separated by tabs.
* **Output**: The output must be in a plain text code field so that the text can be easily copied.
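
Since the model's self-check cannot be trusted, the mechanical rules in this prompt could be checked outside of GPT instead. The following is only a rough sketch under a few assumptions: the output is plain tab-separated text with one table row per line, words are whitespace-separated, and `check_table` is a hypothetical helper name. It checks the table shape, the word count per cell, and whether a first or last word is reused; it cannot tell whether a word is actually a noun, whether a sentence is meaningful, or whether anything rhymes.

```python
from collections import Counter


def check_table(text, rows=4, cols=4, words_per_cell=4):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) != rows:
        problems.append(f"expected {rows} rows, found {len(lines)}")

    edge_words = Counter()  # first and last word of every cell, lowercased
    for r, line in enumerate(lines, start=1):
        cells = line.split("\t")
        if len(cells) != cols:
            problems.append(f"row {r}: expected {cols} cells, found {len(cells)}")
        for c, cell in enumerate(cells, start=1):
            words = cell.split()
            if len(words) != words_per_cell:
                problems.append(
                    f"row {r}, cell {c}: {len(words)} words, expected {words_per_cell}"
                )
            if words:
                # A one-word cell contributes its single word only once.
                edges = [words[0]] if len(words) == 1 else [words[0], words[-1]]
                edge_words.update(word.lower() for word in edges)

    for word, uses in sorted(edge_words.items()):
        if uses > 1:
            problems.append(f"edge word {word!r} used {uses} times")
    return problems


# Small 2x2 example with made-up cells: one cell has 5 words, and "Heart" is reused.
sample = (
    "Heart burns softly through Stillness\tSoul lost in Night\n"
    "Tear cold as Rain\tHeart dim like Light"
)
for problem in check_table(sample, rows=2, cols=2):
    print(problem)
```

The list of problems could then be pasted back into the chat as concrete feedback, rather than asking GPT to count its own words.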

In the instruction you are using:

Word Count: Exactly 4 words must be used in each cell/sentence.

But you have defined different structures for sentences with 5, 4, 3, and 2 words.

This can confuse the model.

Word Count: Exactly 4 words must be used in each cell/sentence.
It could just as well be 5 or any other number.

I actually added those after the error occurred.
It should use the {construction depending on whether you say 5 words or 4 words}.
You can delete that part and it still doesn’t work. I only put it there to try to fix the issue.

Yes, you are correct. It does not count words correctly.
When reinforced with prompts, it mostly gives a correct reply if we start with a fresh chat.
But if we ask for more in the same chat, it starts making mistakes.

I tested with GPT-4o and with a custom GPT built for testing; as you know, custom GPTs run on GPT-4o.


Thanks so much for your analytical work!
You arrived at the same results as I did.

You actually got some cells with 4 words; in my case, all cells had 5 words. It was as if it had remembered the advice from before.

(For me, this raised the question of how GPT can program if it cannot correctly recognize its own structures. Counting its own words, for example, isn’t difficult…
I now have an idea where the trial-and-error cycles I experienced with relatively simple coding work might have come from. GPT usually knows the correct answer, but it keeps making the same mistakes over and over and fails to recognize obvious errors in the code. It seems that the current AI version still lacks self-reflection and analytical capabilities. It was interesting to observe this while working with the poetry generator.)