Noticably worse quality on gpt-3.5-turbo-0613 vs. gpt-3.5-turbo-0301

I expect new versions to slightly change the output you get from a prompt. However, with this recent update the output is shockingly bad! Instead of calling GPT 1x per article, I now have to change my app to call it 5+ times per article.

Use Case:

  • Input news article in German
  • Output (direct translation in english, simplified A1 german, english translation, simplified B1 german, english translation item)

Outcome:

  • gpt-turbo-0301 (works great, follows prompt, works consistently across many input articles)

  • gpt-turbo-0613 (poor result, does not follow prompt, does not work across many input articles, output not usable)

I prove this with more details in an example Google Colab, but this forum won’t allow me to publish this post if it inlcude links.

3 Likes

Hi Jordan,

Welcome to the forum. Interesting finding, I understand that you can’t send a link at the moment, I’m not 100% sure, but you might be able to message me the link and I’ll post it as an add on on your behalf for now.

On to the issue, the 0613 models are tuned to follow the system prompt more aggressively, I winder if you could try putting a “persona” prompt into the system message, something like “You are a veteran German and English translator, you will follow A1 and simplified B1 protocols when requested.” and then do as you would normally. I would be interested in the results of this test.

1 Like

Did know that. I remember that there was a note about that in documentation where they advised to not do instructions on the system prompt. This is no longer true?

Indeed, the GPT3.5 model in the 0613 edition will now follow system prompts quite well, I’m thinking that perhaps this is the change in performance you are seeing, it’s worth a try as it has worked for those who have tried it, at least for those I have spoken with.

2 Likes

That’s strange because the system prompt is the one that is pretty detailed and worked noticably better for me on 0301. It’s actually not performing on 0613.

Link won’t work and I can’t seem to message you… You can construct the link by putting the below together:

  • colab.research.google(dotcom)
  • /drive/1Gfh9L7yaVywxWONlsZ4WOJ7DA73J4u2U?usp=sharing item

For anyone who wishes to view the link : Google Colab

Thanks for the well documented data, hopefully will be useful in looking for issues.

I don’t have the spare time at the moment to look much more deeply that a cursory scan of the results for a solution, but it does appear that there is a difference in generated results.

If i get some time I will do some experimentation.

How does this look to you?

To me, this looks like your original 0301 replies, do you agree? If so then there must be some programmatic difference with your code, temperature perhaps?, system message?

Feel free to replicate my settings from the Playground into your code and experiment.

Full response

  1. Original German Text:
    “Die Hitzemarke von 30 Grad ist am Sonntag zum ersten Mal in diesem Jahr auch auf der Alpennordseite geknackt worden. Um 13.10 Uhr verzeichnete die Messstation in Binningen im Baselbiet 30,1 Grad, wie der staatliche Wetterdienst Meteoschweiz mitteilte.”

  2. English Translation:
    “The heat mark of 30 degrees was cracked for the first time this year on the north side of the Alps on Sunday. At 13:10, the measuring station in Binningen in Basel recorded 30.1 degrees, as reported by the state weather service Meteoschweiz.”

  3. A1/A2 German:
    “Am Sonntag war es auf der Nordseite der Alpen zum ersten Mal in diesem Jahr 30 Grad warm. Die Wetterstation in Binningen in Basel hat um 13.10 Uhr 30,1 Grad gemessen. Das hat der Staatliche Wetterdienst Meteoschweiz gesagt.”

  4. English A1/A2 Translation:
    “On Sunday, it was 30 degrees warm for the first time this year on the north side of the Alps. The weather station in Binningen in Basel measured 30.1 degrees at 13:10. The state weather service Meteoschweiz said this.”

  5. B1/B2 German:
    “Erstmals in diesem Jahr wurde am Sonntag auf der Alpennordseite die 30-Grad-Marke erreicht. Wie der staatliche Wetterdienst Meteoschweiz berichtete, registrierte die Messstation in Binningen im Baselbiet um 13.10 Uhr 30,1 Grad.”

  6. English B1/B2 Translation:
    “For the first time this year, the 30-degree mark was reached on the north side of the Alps on Sunday. As reported by the state weather service Meteoschweiz, the measuring station in Binningen in Basel recorded 30.1 degrees at 13.10.”

Next part:

  1. Original German Text:
    “Mit den steigenden Temperaturen, verbunden mit der seit Wochen anhaltenden Trockenheit, wuchs auch die Waldbrandgefahr. Unter anderem in den Kantonen Zürich, St.Gallen, Bern, Graubünden, Wallis, Waadt und den beiden Basel wurde diese vom Bundesamt für Umwelt (Bafu) bereits vor dem Wochenende als erheblich eingestuft.”

  2. English Translation:
    “With the rising temperatures, combined with the drought that has been ongoing for weeks, the risk of forest fires also increased. Among others, in the cantons of Zurich, St. Gallen, Bern, Graubünden, Valais, Vaud, and both Basels, this was classified as considerable by the Federal Office for the Environment (FOEN) even before the weekend.”

  3. A1/A2 German:
    “Die Temperaturen sind gestiegen und es ist seit Wochen trocken. Deswegen gibt es ein höheres Risiko für Waldbrände. In vielen Kantonen wie Zürich, St.Gallen, Bern und anderen hat das Bundesamt für Umwelt (Bafu) schon vor dem Wochenende gesagt, dass das Risiko groß ist.”

  4. English A1/A2 Translation:
    “The temperatures have increased and it has been dry for weeks. Therefore, there is a higher risk of forest fires. In many cantons like Zurich, St. Gallen, Bern and others, the Federal Office for the Environment (FOEN) said that the risk is high even before the weekend.”

  5. B1/B2 German:
    “Durch die wachsenden Temperaturen und die seit Wochen bestehende Dürre hat die Gefahr von Waldbränden zugenommen. Das Bundesamt für Umwelt (Bafu) hat das Risiko bereits vor dem Wochenende in den Kantonen Zürich, St.Gallen, Bern, Graubünden, Wallis, Waadt und den beiden Basel als erheblich eingestuft.”

  6. English B1/B2 Translation:
    “Due to the rising temperatures and the drought that has persisted for weeks, the danger of forest fires has increased. The Federal Office for the Environment (FOEN) classified the risk as significant in the cantons of Zurich, St. Gallen, Bern, Graubünden, Valais, Vaud, and both Basels even before the weekend.”

2 Likes

Thank you! Will check this out in more detail this week.

1 Like

So, I tried replicating what you have. I used the same tool you’re using on the platform.openai.com playground.

Using the settings from your screenshot, it outputs excactly what I’m seeing in my colab.

  • 0301 outputting as expected, exactly what my system prompt is asking.
  • 0613 ignores my system prompt completley. Just returns english back.

1 Like

Can you try this as a system prompt, please

You are an AI tasked with simplifying German news articles into simpler language levels, A1/A2 and B1/B2, while keeping key details and context intact. Your output should be in list format, with each item including: 1. original German text 2. English translation 3. A1/A2 German 4. English A1/A2 translation 5. B1/B2 German 6. English B1/B2 translation. Ensure context and the source of opinions are clear in all versions. Avoid complex or technical terms, and adopt a compassionate, empathetic tone where suitable.

This is an example
###
Full response

Original German Text:
"Die Hitzemarke von 30 Grad ist am Sonntag zum ersten Mal in diesem Jahr auch auf der Alpennordseite geknackt worden. Um 13.10 Uhr verzeichnete die Messstation in Binningen im Baselbiet 30,1 Grad, wie der staatliche Wetterdienst Meteoschweiz mitteilte.”

English Translation:
“The heat mark of 30 degrees was cracked for the first time this year on the north side of the Alps on Sunday. At 13:10, the measuring station in Binningen in Basel recorded 30.1 degrees, as reported by the state weather service Meteoschweiz.”

A1/A2 German:
“Am Sonntag war es auf der Nordseite der Alpen zum ersten Mal in diesem Jahr 30 Grad warm. Die Wetterstation in Binningen in Basel hat um 13.10 Uhr 30,1 Grad gemessen. Das hat der Staatliche Wetterdienst Meteoschweiz gesagt.”

English A1/A2 Translation:
“On Sunday, it was 30 degrees warm for the first time this year on the north side of the Alps. The weather station in Binningen in Basel measured 30.1 degrees at 13:10. The state weather service Meteoschweiz said this.”
###
2 Likes

Thanks for the quick reply. Yes, now when we provide the example in the sytem prompt it is formatting correctly. I’ll have to try this across multiple articles to see if the other dimensions of the content → prompt → output are equal quality. E.g. Accuracy, readability, etc.

This has been a great help though, thank you!

3 Likes

Hi,

I actually have exactly the same issue of worse quality on my output for the same prompt.
About the following thread I noticed that the screenshot of @Foxalabs is configure on gpt-4 while the discussion is about gpt-3.5 turbo 0613 vs 0301.

I’ll try to reformat my prompts as well and see if there is any improvements.

thanks

Yes, I did use GPT-4, if you see on my comment further on from that I include a demo of it being done with 3.5 by using a an example shot in the prompt

1 Like