Fine-tuning GPT-3 to write in the style of a news outlet

Hi everyone,

I would like to train a fine-tuned model that writes news articles in the style of a specific news outlet when given prompts with news facts.

How much training data would I need for that? Is more always better?

How can I train it to mimic the writing style?

So far, my idea is to collect a lot of news articles, ask GPT-3 to extract the news facts from each one, and then swap them around, so that in the training data the extracted facts become the prompt and the article itself becomes the completion. Roughly like the sketch below.
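Something like this is what I have in mind, as a minimal sketch using the legacy openai Python client that GPT-3 fine-tunes use. The extraction prompt wording is just a placeholder I'd tune by hand on a few articles first:

```python
import json

import openai  # legacy v0.x client, as used for GPT-3 fine-tunes

openai.api_key = "sk-..."  # your API key


def extract_facts(article: str) -> str:
    """Ask GPT-3 to pull the bare news facts out of an article.

    The prompt wording here is an assumption -- test it manually
    before running it over the whole corpus.
    """
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=(
            "Extract the key news facts from this article as a bullet list:\n\n"
            f"{article}\n\nFacts:"
        ),
        max_tokens=300,
        temperature=0,
    )
    return resp.choices[0].text.strip()


def build_training_file(articles: list[str], path: str = "train.jsonl") -> None:
    """Write swapped (facts -> article) pairs in fine-tuning JSONL format."""
    with open(path, "w") as f:
        for article in articles:
            facts = extract_facts(article)
            record = {
                # "\n\n###\n\n" is the separator the fine-tuning docs suggest
                "prompt": facts + "\n\n###\n\n",
                # completions should start with a space and end with a stop token
                "completion": " " + article + " END",
            }
            f.write(json.dumps(record) + "\n")
```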

Has anyone tried that before? Do you think it would work?

Welcome to the community.

Sounds like a good action plan to me.

I would use at least 400+ examples. The more the better, usually.

Having keywords in the prompt and the article (as much of it as you can fit) is a good idea.

I think it would probably have good results with enough samples.


Thanks a lot for your reply. I’m quite new at this but very much eager to learn and experiment.

Here are some more questions:
I could go up to 36,000 articles. Is that too much? If I add classifications like categories per article, could they still fit in one fine-tuned model, or should I create separate models per category?

I mean, I guess it depends on your budget. You're charged for each token in the training data, once per epoch. (And most fine-tunes run for about 4 epochs…)
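To put rough numbers on it (every figure here is an assumption, so check current pricing and your actual average article length):

```python
# Back-of-the-envelope training cost estimate -- all numbers are guesses.
n_articles = 36_000
avg_tokens_per_example = 1_500   # prompt + completion, assumed average
epochs = 4                        # the usual default
price_per_1k_tokens = 0.03        # e.g. Davinci fine-tune training rate

total_tokens = n_articles * avg_tokens_per_example * epochs
cost = total_tokens / 1000 * price_per_1k_tokens
print(f"{total_tokens:,} training tokens -> ~${cost:,.0f}")
# 216,000,000 training tokens -> ~$6,480
```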

You should try to fit as much metadata into the prompt as you can while fitting the article in the completion slot. You only have just over 2,000 tokens to work with.
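A quick way to check each training pair against that limit, assuming tiktoken's r50k_base encoding matches the base GPT-3 models:

```python
import tiktoken

enc = tiktoken.get_encoding("r50k_base")  # GPT-3-era BPE encoding

MAX_TOKENS = 2048  # prompt + completion must fit in the context window


def fits(prompt: str, completion: str) -> bool:
    """Return True if a prompt/completion pair fits the context window."""
    return len(enc.encode(prompt)) + len(enc.encode(completion)) <= MAX_TOKENS
```

You can run this over the JSONL file before uploading and drop or trim the pairs that come back False.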

Here’s a good introduction page in the docs.


@amra.dorjbayar Did you get anywhere with this?