Training data: is it legal to use the same prompt 1000 times?

I’m working on a fine-tuned model for ad copy, and I have several thousand ad copies that I want to use as training data. Is it OK to make prompts like:
Prompt 1: “Write a Facebook Ad copy”
Completion 1: “… ad copy …”

Prompt 2: “Write a Facebook Ad copy”
Completion 2: “… ad copy2 …”

So I’ll end up having 1,000+ identical prompts.

Will it work for actual prompts like “Write a Facebook ad copy for MyAwesomeAmazonStore, which sells this and this. Mention these features and an X discount”? Or should I somehow generate better prompts for the training data?

Thank you :pray:

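If you use the same prompt for every example, the prompt gives the model nothing to tell one ad from another, so it will just produce something like a random one of them. You should add more specific info to each prompt. Luckily, you don’t have to do it by hand: you can ask GPT-3 to first extract the relevant details from each of your existing ad copies (your completion data), and then use those details to build the prompts in your training data.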

For example:
AD1: Wow, product X is the best printer you will ever need. Now we’re giving such and such discount blabla.

Then you let GPT-3 guess the product type and whatever relevant information you need.
Product:
Discount:
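As a rough sketch of that extraction step (the function names and the wording of the instruction are my own assumptions, and the actual call to GPT-3 is left out), you could build one extraction prompt per ad and parse the filled-in fields from the reply:

```python
# Fields taken from the example above; extend with whatever you need.
FIELDS = ("Product", "Discount")


def build_extraction_prompt(ad_copy: str) -> str:
    """Build a prompt asking the model to fill in FIELDS for one ad copy.

    The instruction wording is illustrative, not a tested prompt.
    """
    field_lines = "\n".join(f"{name}:" for name in FIELDS)
    return (
        "Extract the following information from the ad copy.\n\n"
        f"Ad copy: {ad_copy}\n\n"
        f"{field_lines}"
    )


def parse_fields(model_output: str) -> dict:
    """Parse 'Name: value' lines from the model's reply into a dict.

    Lines that do not start with a known field name are ignored.
    """
    parsed = {}
    for line in model_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in FIELDS:
            parsed[name.strip()] = value.strip()
    return parsed
```

You would send the result of `build_extraction_prompt(...)` to the model for each ad, then run `parse_fields(...)` on what comes back. It is worth spot-checking a sample of the extracted fields by hand before training on them.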

Later, your training data becomes something like:
Write a Facebook ad copy:
Product name: something something
Relevant info: something something
Completion: Your original ad copy.
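A minimal sketch of turning those pieces into a training file, assuming a JSONL file with one `prompt`/`completion` pair per line (check the fine-tuning docs for the exact format your model expects; `make_record` and `to_jsonl` are hypothetical helper names):

```python
import json


def make_record(product_name: str, relevant_info: str, ad_copy: str) -> dict:
    """Build one training example following the layout above."""
    prompt = (
        "Write a Facebook ad copy:\n"
        f"Product name: {product_name}\n"
        f"Relevant info: {relevant_info}\n"
        "Completion:"
    )
    # The completion is the original, human-written ad copy.
    return {"prompt": prompt, "completion": ad_copy}


def to_jsonl(records) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

At inference time you would then write your real prompt in the same layout (product name, relevant info), so it matches what the model saw during training.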

I hope this helps

Thank you! That’s a good suggestion, I’ll try that
