Blog outline training

Hello all,

I am so excited about GPT-3. I want to fine-tune a model (curie) that can output a blog outline for a given title. I prepared a dataset of 450 examples in the format

{"prompt": "TITLE\n\n###\n\n\n", "completion": "heading 1\nheading 2\n… END"}

It does not seem to produce accurate output. It mostly produces ordinary prose instead of a list of points. I am wondering if I need to increase the sample size. Here are my training parameters; a sketch of how I launch the job follows the list.

  • n_epochs=2
  • learning_rate_multiplier=0.01
  • batch_size=4
  • use_packing=True
  • prompt_loss_weight=1.00
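
For reference, here is roughly how I start the job. This is a minimal sketch using the legacy openai Python package (pre-1.0); the file name is a placeholder, and I have left use_packing out of the call since I'm not sure every version of the fine-tunes endpoint accepts it:

```python
import openai

openai.api_key = "sk-..."  # your API key

# Upload the 450-example JSONL, then start the fine-tune.
# "outlines.jsonl" is a placeholder name.
training_file = openai.File.create(
    file=open("outlines.jsonl", "rb"), purpose="fine-tune"
)

openai.FineTune.create(
    training_file=training_file["id"],
    model="curie",
    n_epochs=2,
    learning_rate_multiplier=0.01,
    batch_size=4,
    prompt_loss_weight=1.0,
)
```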

Would appreciate your feedback. I built the sample data from multiple domains.

You might need a demarcation of some sort to let it know the title section has ended. But it would also be helpful to see a couple of full examples of your training data. Also, whitespace is generally confusing to GPT-3 for some reason, so you might need to append “BODY” or some other token at the end of all prompts/inputs as the demarcation instead of whitespace. The key is to use a consistent and unique token.
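
Something along these lines. This is just a sketch; the helper name and the raw title/headings inputs are hypothetical, and the separator is one choice among many:

```python
SEPARATOR = "\n\n###\n\n"  # any unique, consistent token works; "BODY\n" would too
STOP = " END"

def to_record(title, headings):
    """Build one training record with an explicit demarcation token
    between the title and the expected outline."""
    return {
        "prompt": title.strip() + SEPARATOR,
        "completion": " " + "\n".join(headings) + STOP,
    }

# to_record("Best Small Business Ideas",
#           ["1. popular small business ideas", "2. franchise opportunities"])
```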

It’s hard to troubleshoot precisely without seeing actual sample data and outputs.

Thank you, here is one example I have,

{"prompt": "Best Small Business Ideas \n\n###\n\n", "completion": "1. popular small business ideas\n2. free resources for your small business\n3. small business ideas by personality type\n4. popular small business ideas videos\n5. franchise opportunities\n6. small business ideas: complete guides\n7. start a business in your state END"}

And what does the fine-tuned curie output for that look like? The first issue I see is that it’s hard to predict the output given the input, so the model may not be learning a good mapping between the two and may be generating random completions instead. The model should be able to generate a list, though; it’s odd that it’s not doing that (especially since list generation is a zero-shot task anyway).

Now what are you putting in at inference time?

This sometimes produces good output, and sometimes it’s not great.

You’re getting inconsistent results because your training input does not match your inference input. That’s why I said you need to use a more consistent demarcation.
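
Concretely, the inference call should reuse the exact prompt suffix and stop token from training. A sketch with the legacy Completion API and a placeholder fine-tune ID:

```python
import openai

# The suffix and stop token must be byte-for-byte identical to training.
SEPARATOR = "\n\n###\n\n"

resp = openai.Completion.create(
    model="curie:ft-...",  # placeholder; use your actual fine-tune ID
    prompt="Best Small Business Ideas" + SEPARATOR,
    max_tokens=256,
    temperature=0.3,
    stop=[" END"],  # the terminator the completions were trained with
)
print(resp["choices"][0]["text"])
```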


Thank you, I will try giving the engine some direction (say, supplying the 1st point) to improve the odds.

That’s not what I mean. I mean that your training looks like this:

TITLE:


[Title here]


While your inference looks like

[Title here]
1.

You’re using an entirely different format at inference time. You need to use the exact same format for training and inference. There’s actually nothing wrong with your inference format, so make your training format match it exactly: every training prompt should end with “1.”, and every inference prompt should as well. This will train the model to generate consistent output immediately after the “1.”
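
To make that concrete, something like this sketch, with hypothetical helper names and headings as plain strings without numbering:

```python
STOP = " END"

def train_record(title, headings):
    """Training record whose prompt ends with '1.', exactly like inference."""
    numbered = "\n".join(f"{i}. {h}" for i, h in enumerate(headings, 1))
    # The prompt already supplies the leading "1.", so strip it from the completion.
    return {"prompt": f"{title}\n1.", "completion": " " + numbered[3:] + STOP}

def inference_prompt(title):
    """Inference prompt in the identical shape."""
    return f"{title}\n1."
```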


Thank you. I changed the training data to,

{"prompt": "Best Small Business Ideas\n1.", "completion": " popular small business ideas\n2. free resources for your small business\n3. small business ideas by personality type\n4. popular small business ideas videos\n5. franchise opportunities\n6. small business ideas: complete guides\n7. start a business in your state END"}

Here is the output now,

Not sure what I’m missing.

I would need to see a lot more samples of your training data. You said you have 450 samples? Can you post the JSONL somewhere? Also make sure you’re using the correct fine-tune version in the Playground; they usually appear in descending order.
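
In the meantime, here is a quick sanity check you could run over the file yourself. A sketch only; data.jsonl is a placeholder name:

```python
import json

# Every record should share the same shape, otherwise the model
# can't learn a stable prompt-to-completion mapping.
with open("data.jsonl") as f:
    for n, line in enumerate(f, 1):
        rec = json.loads(line)
        assert rec["prompt"].endswith("\n1."), f"line {n}: prompt lacks the '1.' suffix"
        assert rec["completion"].startswith(" "), f"line {n}: completion should start with a space"
        assert rec["completion"].endswith(" END"), f"line {n}: completion lacks ' END'"
print("all records consistent")
```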

I have sent you a message.
