Words repeatedly cut off in Spanish and French since gpt4o was released

alexred20 · June 3, 2024, 2:01pm

We use the API for our AI for language learning. Since gpt4o was introduced, we have noticed frequent typos where the ends of words are cut off. We’ve primarily noticed this in Spanish and French, but it may be happening in other languages too. Examples of errors:
‘neces’ instead of ‘necesarias’
‘prev’ instead of ‘previas’
‘euthanas’ instead of ‘euthanasia’

As you can see, there is a pattern here. It happens irrespective of the GPT4 model used. This urgently needs to be fixed because these errors are likely confusing thousands of language learners across the world.

turbolucius · June 3, 2024, 11:16pm

Hello,

I have regularly been using GPT-4 and GPT-4o in the past few months in both English and French and have not had this issue at all.

My first guess is that this may be an issue with the temperature setting? I noticed that setting it to 1.0 or above can make it do weird stuff with words sometimes, which looks similar to what you’re describing. I personally set mine between 0 and 0.7.

Or perhaps an issue with tokens not being streamed to your app correctly? Does this also happen if you disable streaming the chat/assistant response?

In both cases, the output text seems to be missing/skipping the word’s last token, as the words you provided happen to be tokenized (split into tokens) in a way that cuts them off exactly like you described.
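
To make the missing-last-token idea concrete, here is a minimal Python sketch. The splits in `ASSUMED_SPLITS` are hypothetical: they are inferred from the truncations reported in the first post, not taken from a real tokenizer. `join_dropping_last` is only a stand-in for whatever client code might lose the final streamed piece.

```python
# Hypothetical token splits, inferred from the truncated words reported
# in this thread. They are NOT verified tokenizer output.
ASSUMED_SPLITS = {
    "necesarias": ["neces", "arias"],
    "previas": ["prev", "ias"],
    "euthanasia": ["euthanas", "ia"],
}


def join_all(pieces):
    """Correct consumer: concatenate every streamed piece."""
    return "".join(pieces)


def join_dropping_last(pieces):
    """Buggy consumer (illustrative): the final piece never arrives."""
    return "".join(pieces[:-1])


def truncation_report(splits):
    """Map each word to (complete text, text with the last piece lost)."""
    return {
        word: (join_all(pieces), join_dropping_last(pieces))
        for word, pieces in splits.items()
    }


if __name__ == "__main__":
    for word, (full, cut) in truncation_report(ASSUMED_SPLITS).items():
        print(f"{word}: complete={full!r} truncated={cut!r}")
```

If the truncated forms match what shows up in the app while the non-streamed response is complete, the client-side handling of the stream is the more likely culprit than the model.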

gerritcloete.gj · June 4, 2024, 7:03am

It’s not just Spanish and French. @turbolucius I think you are correct.

GPT4 makes mistakes like changing FullPageLoader into FullPageJumper. It seems like random spelling mistakes, but actually they are token mistakes.

Whereas the default in the chat app GPT4o does not make these mistakes.

We need access to the temperature setting in the app.

alexred20 · July 10, 2024, 2:39pm

Thank you very much, appreciated! It was indeed the tokens getting cut off due to an issue with our ruby gem.
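
For anyone hitting something similar: the thread does not say what the gem's bug actually was, so the following is only a sketch of one common way a streaming client can lose tokens. It assumes a simplified server-sent-events format (`data: {...}` lines with a hypothetical `delta` field, not the real API payload shape) and shows how parsing each network chunk on its own drops an event that is split across two reads, while buffering until a full line arrives does not.

```python
import json

PREFIX = "data: "


def _delta_from_line(line):
    """Return the text carried by one SSE line, or None if there is none."""
    if not line.startswith(PREFIX):
        return None
    payload = line[len(PREFIX):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)["delta"]  # "delta" is a simplified, assumed field


def parse_naive(chunks):
    """Buggy: assumes every network chunk contains only complete lines."""
    out = []
    for chunk in chunks:
        for line in chunk.split("\n"):
            try:
                text = _delta_from_line(line)
            except json.JSONDecodeError:
                continue  # half an event: silently dropped
            if text is not None:
                out.append(text)
    return "".join(out)


def parse_buffered(chunks):
    """Keeps incomplete data in a buffer until its newline arrives."""
    out = []
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            text = _delta_from_line(line)
            if text is not None:
                out.append(text)
    return "".join(out)


def simulated_chunks():
    """Two events plus an end marker, with the second event split mid-JSON."""
    first = 'data: {"delta": "neces"}\n\n'
    second = 'data: {"delta": "arias"}\n\n'
    done = "data: [DONE]\n\n"
    return [first + second[:12], second[12:] + done]


if __name__ == "__main__":
    print("naive:   ", parse_naive(simulated_chunks()))
    print("buffered:", parse_buffered(simulated_chunks()))
```

Comparing the accumulated streamed text against the same request made with streaming disabled is a quick way to tell whether the client is losing data.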
