Replacing legacy curie model

I have fine-tuned a curie model for a conditional generation task. Think about it like this: given a small text, create an appropriate headline. It's in production and everybody is super happy with the results. Now they are going to shut down curie and I have to retrain, but I get nowhere near the previous performance. Interestingly, there is no real difference in performance regardless of whether I train babbage or davinci.

Previously my training went like this: I had an initial dataset with ~1500 texts. For each of those I provided 2 different options for a headline, so in total there were ~3000 examples. Subsequently we identified texts for which the model didn't produce good results and continued training with these smaller datasets (100-500 examples), using only half the number of epochs and half the learning rate multiplier. So I can't even use the same training paradigm.
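For reference, the continued-training setup (half the epochs, half the learning rate multiplier of the initial run) could be sketched roughly like this with the current fine-tuning API. The base values, file ID, and model name are placeholders, not the actual ones used:

```python
# Sketch of a continued fine-tuning run with halved hyperparameters.
# Base values below are illustrative placeholders, not the real settings.

def halved_hyperparams(base_epochs: int, base_lr_multiplier: float) -> dict:
    """Return continued-training hyperparameters: half the epochs and
    half the learning-rate multiplier of the initial run."""
    return {
        "n_epochs": max(1, base_epochs // 2),
        "learning_rate_multiplier": base_lr_multiplier / 2,
    }

hp = halved_hyperparams(base_epochs=4, base_lr_multiplier=0.2)
print(hp)  # {'n_epochs': 2, 'learning_rate_multiplier': 0.1}

# With the openai Python client this would look roughly like
# (not executed here; IDs are placeholders):
#
# from openai import OpenAI
# client = OpenAI()
# client.fine_tuning.jobs.create(
#     training_file="file-...",    # placeholder file ID
#     model="ft:davinci-002:...",  # continuing from an earlier fine-tune,
#                                  # where supported
#     hyperparameters=hp,
# )
```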

Is anybody in a similar situation who has some advice on how to improve the performance?


Hey Max! Have you tried any of our other models (like 3.5 Turbo Instruct) or GPT-4 for this? It sounds like a use case that these models would do really well at!

If you have a specific example with a prompt and output that is working well with Curie, would love to see it and try it out.
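In case it's useful, a minimal sketch of trying the headline task against an instruct-style model could look like this. The prompt wording is an assumption and would need tuning to the real use case:

```python
# Minimal sketch of prompting an instruct model for headline generation.
# The prompt wording is an assumed example, not the original prompt.

def build_headline_prompt(text: str) -> str:
    """Wrap an input text in a simple headline-generation instruction."""
    return (
        "Write a short, appropriate headline for the following text:\n\n"
        f"{text}\n\nHeadline:"
    )

prompt = build_headline_prompt("Example article text goes here.")

# With the openai Python client this would be called roughly like
# (not executed here):
#
# from openai import OpenAI
# client = OpenAI()
# resp = client.completions.create(
#     model="gpt-3.5-turbo-instruct",
#     prompt=prompt,
#     max_tokens=30,
# )
# print(resp.choices[0].text.strip())
```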
