Fine tuning gpt-3 to write in the writing style of a news outlet

amra.dorjbayar · January 11, 2023, 1:18am

Hi everyone,

I would like to try to train a fine-tuned model that writes news articles in the writing style of a specific news outlet, hen given prompts with news facts.

How much training data would I need for that? Is more the better?

How can I train it to mimic the writing style?

So far, my idea is to collect a lot of news articles. Ask gpt-3 to extract the news facts from these articles. And then switch their places, so that in the training data the resulting news facts become the prompt and the news article themselves become the output.

Has anyone tried that before? Do you think it would work?

PaulBellow · January 11, 2023, 1:38am

Welcome to the community.

Sounds like a good action plan to me.

I would use at last 400+ examples. The more the better - usually.

Having keywords in the prompt and the article (as much of it as you can fit) is a good idea.

I think it would probably have good results with enough samples.

amra.dorjbayar · January 11, 2023, 1:48am

Thanks a lot for your reply. I’m quite new at this but very much eager to learn and experiment.

Here are some more questions:
I could go up to 36.000 articles. Is that too much? If I add classifications like categories per article. Could they still fit in one fine-tuning model? Or should I create seperate models per category?

PaulBellow · January 11, 2023, 2:08am

I mean, I guess it depends on your budget. You’re charged for each token used in the training data - once for each epoch. (And most fine-tunes go about 4 epochs…)

You should try to fit as much metadata in the prompt as you can while fitting the article in the completion spot. You only have just over 2000 tokens to work with.

Here’s a good introduction page in the docs.

parakeet · February 2, 2023, 10:49pm

@amra.dorjbayar Did you get anywhere with this?

provisio · September 24, 2023, 4:22pm

@parakeet - How far have you gotten with your exploration here? Looking to do the same thing

_j · September 24, 2023, 4:56pm

Here’s a prompt. Make The AI itself document what you want. Then have it follow that format.

I would like to have you write a document about the lede and 5Ws + H system of traditional top-down truncatable journalism writing.

For news writing, it may be tedious, because a story should continue to present new newsworthy truthful information and details as you continue to read, and thus you’re providing just as much info to the AI as it is supposed to write. And then it still has tons of human-feedback training on its little articles with “in conclusion, additionally, ultimately”, to overcome. gpt-3.5-turbo-instruct may do better with its smaller amount of chat training.

Fine tune a base model would be a whole bunch of examples of what information and instructions you provide and what comes out of that.

parakeet · September 25, 2023, 5:32am

I found a good level of success. My use case was writing stories based on interview transcripts. I settled on a lengthy prompt to guide GPT-4 via API. It also helped to provide it with the name of the publication to emulate, which basically only does this story format (I once give it the prompt minus a transcript, and it spat out a generic version of the stuff we always do).

Topic		Replies	Views
How to mimic my writing sytle via fine-tuning? Prompting	1	2306	April 4, 2023
Fine-tuned model handles prompts differently Prompting	6	955	November 23, 2023
Fine-Tuning 3.5 Turbo for writing style/tone API	1	1648	September 27, 2023
Are fine-tuned models a good way to give GPT a specific tone of voice? API api	5	3968	July 20, 2023
Training gpt-3.5 to autocomplete for a niche domain and a specific writing style Community chatgpt	13	1877	July 25, 2024

Fine tuning gpt-3 to write in the writing style of a news outlet

Related topics