I would like to train a fine-tuned model that writes news articles in the writing style of a specific news outlet when given prompts with news facts.
How much training data would I need for that? Is more the better?
How can I train it to mimic the writing style?
So far, my idea is to collect a lot of news articles and ask GPT-3 to extract the news facts from them. Then I would switch their places, so that in the training data the extracted news facts become the prompt and the news articles themselves become the output.
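For anyone wanting to see what that swap could look like on disk, here is a minimal sketch, assuming the facts have already been extracted and that the training file is JSONL with "prompt" and "completion" fields. The separator and stop marker are illustrative choices, and the example pair is made up, not real data.

```python
import json

# Assumed markers: one tells the model where the prompt ends, the other
# where the article ends. Both are illustrative, not required values.
SEPARATOR = "\n\n###\n\n"
STOP = " END"


def to_record(facts, article):
    """Turn extracted facts plus the original article into one training example."""
    prompt = "\n".join(f"- {fact}" for fact in facts) + SEPARATOR
    # Leading space on the completion is a common convention for these files.
    completion = " " + article.strip() + STOP
    return {"prompt": prompt, "completion": completion}


def to_jsonl(pairs):
    """Serialize (facts, article) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps(to_record(facts, article), ensure_ascii=False)
        for facts, article in pairs
    )


# Made-up example pair, standing in for one scraped article and its facts.
example_pairs = [
    (
        ["The city council approved the budget on Monday.", "The vote was 7-2."],
        "The city council approved next year's budget Monday in a 7-2 vote.",
    )
]

jsonl_text = to_jsonl(example_pairs)
print(jsonl_text)
```

The facts go in as a bulleted list so the prompt format stays identical across examples, which is what you would then reproduce at generation time.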
Has anyone tried that before? Do you think it would work?
Welcome to the community.
Sounds like a good action plan to me.
I would use at least 400 examples. The more the better - usually.
Having keywords in the prompt and the article (as much of it as you can fit) in the completion is a good idea.
I think it would probably have good results with enough samples.
Thanks a lot for your reply. I’m quite new at this but very much eager to learn and experiment.
Here are some more questions:
I could go up to 36,000 articles. Is that too much? If I add classifications, like a category per article, could they still fit in one fine-tuned model? Or should I create separate models per category?
I mean, I guess it depends on your budget. You’re charged for each token used in the training data - once for each epoch. (And most fine-tunes go about 4 epochs…)
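To put rough numbers on that, here is a small sketch of the arithmetic: billed tokens are the training tokens multiplied by the number of epochs. The average article length and the price per 1,000 tokens below are placeholder assumptions, so plug in your own measurements and the current rate.

```python
def billed_tokens(tokens_per_example, n_examples, epochs=4):
    """Training tokens are charged once per epoch."""
    return tokens_per_example * n_examples * epochs


def estimated_cost(tokens_per_example, n_examples, price_per_1k_tokens, epochs=4):
    """Cost in whatever currency price_per_1k_tokens is quoted in."""
    return billed_tokens(tokens_per_example, n_examples, epochs) / 1000 * price_per_1k_tokens


# Assumed figures: 36,000 articles at about 1,000 tokens each, 4 epochs.
total = billed_tokens(tokens_per_example=1000, n_examples=36_000, epochs=4)
print(total)  # 144000000

# Placeholder price, not a real quote.
print(estimated_cost(1000, 36_000, price_per_1k_tokens=0.01))
```

Running the same numbers for a 400-example subset first is a cheap way to check whether the style transfer works before paying for the full set.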
You should try to fit as much metadata in the prompt as you can while fitting the article in the completion spot. You only have just over 2000 tokens to work with.
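A quick way to screen examples against that limit is sketched below. It assumes a budget of 2,000 tokens and uses the rough rule of thumb of about 4 characters per token for English text, so treat the counts as estimates; a real tokenizer would give exact numbers. The prompt layout with category and keywords is just one hypothetical way to pack metadata in.

```python
CHARS_PER_TOKEN = 4   # rough rule of thumb for English, not an exact count
TOKEN_BUDGET = 2000   # assumed combined limit for prompt + completion


def rough_token_count(text):
    """Estimate tokens by character count, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def build_prompt(category, keywords, facts):
    """Hypothetical layout: metadata first, then the news facts."""
    lines = [f"Category: {category}", f"Keywords: {', '.join(keywords)}", "Facts:"]
    lines += [f"- {fact}" for fact in facts]
    return "\n".join(lines)


def fits_budget(prompt, completion, budget=TOKEN_BUDGET):
    """True if the estimated prompt + completion size stays within the budget."""
    return rough_token_count(prompt) + rough_token_count(completion) <= budget


prompt = build_prompt("Politics", ["budget", "city council"], ["The vote was 7-2."])
article = "The city council approved next year's budget Monday in a 7-2 vote."
print(fits_budget(prompt, article))  # True
```

Articles that fail the check could be truncated or dropped before building the training file.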
Here’s a good introduction page in the docs.
@amra.dorjbayar Did you get anywhere with this?
@parakeet - How far have you gotten with your exploration here? Looking to do the same thing
Here’s a prompt. Make the AI itself document what you want, then have it follow that format.
I would like to have you write a document about the lede and 5Ws + H system of traditional top-down truncatable journalism writing.
For news writing, it may be tedious, because a story should keep presenting new, newsworthy, truthful information and details as you read on, so you end up providing just as much information to the AI as it is supposed to write. And it still has tons of human-feedback training on little articles full of “in conclusion”, “additionally”, and “ultimately” to overcome.
gpt-3.5-turbo-instruct may do better with its smaller amount of chat training.
Fine-tuning a base model would take a whole bunch of examples of the information and instructions you provide and what should come out of that.
I found a good level of success. My use case was writing stories based on interview transcripts. I settled on a lengthy prompt to guide GPT-4 via the API. It also helped to provide it with the name of the publication to emulate, which basically only does this story format (I once gave it the prompt minus a transcript, and it spat out a generic version of the stuff we always do).