I’m not sure these models can correctly talk about beginning and ending letters.
If you think about how tokens work, a token doesn’t carry any metadata about what its first letter is, and some tokenizers even start their tokens with a space!
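A quick sketch of what I mean (assuming Python’s tiktoken library and the cl100k_base encoding, purely for illustration; whichever tokenizer the model actually uses will differ in the details):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding, for illustration only

ids = enc.encode("Moonlight and mirrors on the lakes")
print(ids)  # the model only ever sees these integers; there is no "first letter" field

for i in ids:
    piece = enc.decode_single_token_bytes(i).decode("utf-8")
    print(repr(piece))  # most word tokens come back with a leading space
```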
I’m actually quite surprised that the poem is as good as it is at adhering to the rules. Something in the training must have told it that those particular tokens all qualify as “begin with m.”
“M”, " moon", " mirror" and … " lakes" ?
It might also be better to tokenize it as " moon" "beam" "s" (in English, the plural suffix "s" is a meaningful unit on its own) instead of " moon" "be" "ams", but that’s what it does.
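If you want to check how a particular word comes apart, the same kind of sketch works (again assuming tiktoken with cl100k_base; the split is dictated by the tokenizer’s merge table, not by English morphology):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding, for illustration only

ids = enc.encode(" moonbeams")
pieces = [enc.decode_single_token_bytes(i).decode("utf-8") for i in ids]
print(pieces)  # the split follows the merge table, which may or may not line up with morphemes
```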