Training data: is it legal to use the same prompt 1000 times?

mike_re · January 24, 2023, 12:54pm

I’m working on a trained model for ad copy, and I have several thousand ad copies that I want to use as training data for it. Is it OK to make prompts like this:
Prompt 1: “Write a Facebook Ad copy”
Completion 1: “… ad copy …”

Prompt 2: “Write a Facebook Ad copy”
Completion 2: “… ad copy2 …”

So I’ll end up having 1,000+ identical prompts.

Will it work for actual prompts like “Write a Facebook ad copy for MyAwesomeAmazonStore, which sells this and this. Mention these features and X discount”? Or should I somehow generate better prompts for the training data?

Thank you

amra.dorjbayar · January 24, 2023, 1:03pm

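If you use the same prompt for every example, the model has nothing to tell the completions apart, so it will just produce something like one of them at random. You should add more specific info to each prompt. Luckily, you don’t have to do it by hand. You can ask GPT-3 to first extract the relevant details from each completion and then use those details in the prompts of your training data.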

For example:
AD1: Wow, product X is the best printer you will ever need. Now we’re giving such and such discount, blah blah.

Then you let GPT-3 guess the product type and whatever relevant information you need.
Product:
Discount:
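
A rough sketch of that labeling step in Python, in case it helps. The helper names (build_extraction_prompt, parse_fields) and the instruction wording are made up for illustration, the field names follow the example above, and the actual call to the model is left out on purpose:

```python
# Hypothetical helpers: build a prompt that asks the model to fill in
# the fields for one ad, then parse the "Field: value" lines it returns.
FIELDS = ["Product", "Discount"]


def build_extraction_prompt(ad_copy, fields=FIELDS):
    """Return a prompt asking for one 'Field: value' line per field."""
    lines = [
        "Read the ad below and answer with one line per field,",
        "in the form 'Field: value'.",
        "",
        "AD: " + ad_copy.strip(),
        "",
        "Fields: " + ", ".join(fields),
    ]
    return "\n".join(lines)


def parse_fields(response_text, fields=FIELDS):
    """Pick the known fields out of the model's reply; missing ones stay empty."""
    result = {name: "" for name in fields}
    for line in response_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in result:
            result[key] = value.strip()
    return result
```

You would send build_extraction_prompt(ad) to the model for each ad, run parse_fields on the reply, and spot-check a sample by hand, since the guesses will not always be right.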

Later your training data becomes something like:
Write a Facebook ad copy:
Product name: something something
Relevant info: something something
Completion: Your original ad copy.
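
Put together, building the training file could look roughly like this. It is only a sketch: it assumes a JSONL file with one "prompt"/"completion" pair per line, the helper names (make_example, to_jsonl) are hypothetical, and the leading space on the completion is an assumed convention that you should check against the fine-tuning docs you are using:

```python
import json


def make_example(product_name, relevant_info, ad_copy):
    """Build one training pair from the extracted fields and the original ad."""
    prompt = (
        "Write a Facebook ad copy:\n"
        f"Product name: {product_name}\n"
        f"Relevant info: {relevant_info}\n"
        "Completion:"
    )
    # Assumption: completion starts with a single space and has no stray whitespace.
    return {"prompt": prompt, "completion": " " + ad_copy.strip()}


def to_jsonl(examples):
    """Serialize the examples as JSON Lines, one object per line."""
    return "\n".join(json.dumps(ex, ensure_ascii=False) for ex in examples)
```

Each of your ads then gets its own prompt, so the model can learn how the product details map to the copy instead of seeing the same prompt for every completion.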

I hope this helps

mike_re · January 24, 2023, 2:17pm

Thank you! That’s a good suggestion, I’ll try that
