When using the new gpt-4-turbo models, I seem to get better results when I leave examples out of my system prompts. As soon as I put them in (even with 3-4 different examples), the model sticks too closely to those examples, and I basically get an overfitted answer.
Let’s say I want JSON output: I’ve had better results by providing an output template than by giving three output examples. When I use the examples, the model sticks too closely to the values I used there.
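For illustration, a template along these lines (the field names here are just placeholders, not from my actual prompts):

```
Respond with JSON matching this template:
{
  "title": "<short title>",
  "summary": "<one-sentence summary>",
  "tags": ["<tag>", "..."]
}
```

The model fills in the angle-bracket slots, instead of echoing values it saw in examples.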
I should say that I always work at temperature 0 or close to 0, as that makes the most sense for the type of work I do. I did not see this behavior before the turbo models were introduced, though.
Any others experiencing this, or do I just need to work on my “example engineering” craft?
Personally, I’ve never gotten multi-shot working satisfactorily.
We’re discussing multi-shot here, if you wanna have a look at it:
For JSON output, you can do stuff like “start with {” and “you are part of a larger system, and must provide JSON output - any other response will cause the system to crash.”
The model is generally capable of following simple schemas fairly reliably (I like using TS type definitions).
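Something like this as the system prompt, for example (the type name and fields are made up, just to show the shape):

```
You are part of a larger system and must provide JSON output only.
Your reply must match this TypeScript type:

type Answer = {
  category: "billing" | "technical" | "other";
  confidence: number; // between 0 and 1
  reply: string;
};

Start your reply with {
```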
Adding examples tends to add a lot of unnecessary noise to your signal with minimal utility, as you’ve discovered. Working on your description of what the model should do, and cutting as much noise as possible, is generally a good idea IMO.
So, tl;dr: this individual (me) believes using examples isn’t a good idea in most cases.
Sorry for the late reply, but thanks for taking the time to answer in detail!
By the way, I never have issues with the model generating JSON: I just give a JSON output instruction and also set the JSON response-format parameter on the call. My question was more about the best way to specify the exact structure of the generated JSON.
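Concretely, something like this (just a sketch using the openai Node SDK; the model name and messages are placeholders, adjust to your setup):

```ts
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const completion = await client.chat.completions.create({
  model: "gpt-4-turbo",
  temperature: 0,
  // the "json parameter": forces the model to return valid JSON
  response_format: { type: "json_object" },
  messages: [
    { role: "system", content: "Respond with JSON matching the given template." },
    { role: "user", content: "..." }, // actual task input goes here
  ],
});

console.log(completion.choices[0].message.content);
```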
Your answer was useful though, and confirmed some of my own findings, thanks!