Fine-tuned model returns an acceptable result only once, then rubbish every single time

Hi.
I’ve fine-tuned a couple of davinci models for completion.
When I use them I get an acceptable result (the structure of the completion is correct and the response is okay, though it could be fine-tuned further), but only for the first request. After that it returns rubbish (random text from the input prompt or the fine-tune data) every single time.
Seems like a bug to me.
Thanks.

PS. Billing limits are BS and take too long to change.

Welcome to the forum.

Can you give us an example of your prompt and output? What settings/model are you using?

When you get garbage out, it’s usually because you sent garbage in…

Also give us some steps to reproduce the bug. When you say “first request”, are you running multiple API calls in a loop? Or do you request multiple results in the request body? What’s the timing between one “first request” and another? Are they the same input?
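
For example, something like this minimal harness would answer most of those questions. Just a sketch, assuming the legacy pre-1.0 openai Python library; the model name and prompt are placeholders:

```python
import time

import openai

openai.api_key = "sk-..."  # your API key

# Placeholder model name; substitute your actual fine-tune ID.
MODEL = "davinci:ft-your-org-2023-08-22-00-00-00"
PROMPT = "...your document excerpt and instructions..."

# Send the identical prompt several times in a row and log each result,
# so we can see whether only the first call comes back clean.
for i in range(5):
    resp = openai.Completion.create(
        model=MODEL,
        prompt=PROMPT,
        max_tokens=31,
        temperature=0,
    )
    print(f"--- attempt {i + 1} ---")
    print(repr(resp["choices"][0]["text"]))
    time.sleep(1)  # small delay between calls
```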

I tested a bit on one of my fine-tunes, but it only returned rubbish for the first three attempts (which were things like “test” lol), and then it worked fine.

I fine-tuned the davinci model with 138 examples. The prompt includes an excerpt of a document and asks for a response in JSON format with two keys; the completion is that JSON, around 31 tokens. The prompt takes all available tokens, so all examples are close to the limit, above 2,000 tokens.
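
For reference, each line of my training file looks roughly like this. Here is a shortened, made-up stand-in as a Python sketch that emits one JSONL line in the legacy prompt/completion format; the real prompts are ~2,000 tokens and the key names are placeholders:

```python
import json

# Illustrative only: a shortened stand-in for one training example.
example = {
    "prompt": "<document excerpt>\n\nReturn a JSON object with two keys...",
    # Legacy fine-tune completions are usually written starting with a space.
    "completion": ' {"key_one": "value", "key_two": "value"}',
}
print(json.dumps(example))  # one line of the fine-tune JSONL file
```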

The first prompt I send after fine-tuning returns exactly what I fine-tuned for. Maybe not ideal, but pretty good. Each subsequent prompt, though, returns “garbage”, meaning a completely unacceptable response: very rarely JSON at all, in most cases just fragments of the training data (and not only that). By garbage I mean things like 31 tokens of spaces (empty text, which sometimes appears in the training data).

I have around 1,000 examples I could train on, but it costs $25 to train one set, and it takes forever to change the max budget of $200…

I’ve read today that 3.5-turbo can be fine-tuned, so I will look at it once my budget resets or they increase the limit…

Yeah, it’s weird that it’s good once, then busted on subsequent calls… could be the training data itself (how you set it up) and/or the lack of examples… I’d try with 500 maybe, on a newer model?

JSON could be a hint. It can mess special formatting up a bit. Also, a common mistake I’ve made before is trying to save on tokens by not giving the model sufficient instructions. It would work well at times, but be inconsistent.
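
One quick way to quantify that inconsistency is to try parsing every completion and count the failures. Rough sketch, assuming you’ve already collected the raw completion strings in a list:

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if the completion parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# `completions` is assumed to hold the raw text of each response
# from the fine-tuned model (example values here are made up).
completions = [' {"a": 1, "b": 2}', " " * 31, "random training fragment"]
valid = sum(is_valid_json(c.strip()) for c in completions)
print(f"{valid}/{len(completions)} completions were valid JSON")
```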

It’s an interesting day to bring this up, especially with the major changes to fine-tuning. Could it be a fluke that just affected yesterday? Maybe try retraining it; davinci is 5-10x cheaper today.
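
If you do retrain, the legacy flow is roughly this. A sketch using the pre-1.0 openai Python library; the file name is a placeholder:

```python
import openai

openai.api_key = "sk-..."  # your API key

# Upload the training file, then kick off a fine-tune on the base model.
upload = openai.File.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)
job = openai.FineTune.create(
    training_file=upload["id"],
    model="davinci",
)
print(job["id"])  # poll this job ID until the fine-tune finishes
```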
