Fine Tuning Prompt Clarification

I’ve pulled together a training set of nearly 300 prompts/completions but I’m getting really poor results.

With text-davinci-002 I get decent results, but with my fine-tuned model I don’t.

I have a feeling I’m getting the training data wrong. Here is an example:

{“prompt”: “Rewrite the Article to include the Keywords. Keep any HTML formatting. Dont’ add new paragraphs. Text changes to the Article should be minimal to retain the original content. Keywords can be changed to plural or singular.\n\nArticle: apples are <a href="#">delicious\nkeywords: [green]\nRewrite to include keywords: green apples are <a href="#">delicious\n\nArticle: Digital PR is great\nkeywords: [search engine optimisation, SEO]\nRewrite to include keywords: Digital PR is great for search engine optimisation and SEO\n\nArticle: we have a range of quality engine oil that will keep your car running smoothly. There are many options to suit your vehicle\u2019s requirements, so you\u2019re bound to find a compatible type.\nkeywords: [diesel engine oil, engine oil for cars]\nRewrite to include keywords: we have a range of quality engine oil, such as diesel engine oil that will keep your car running smoothly. There are many options to suit your vehicle\u2019s requirements, so you\u2019re bound to find a compatible type of engine oil for cars.\n\nArticle: Unreliable brakes can be extremely dangerous. Even if your brakes work, their performance can decrease over time, and your stopping distance may be compromised. So, you need check and test all your brake components regularly, keeping an eye out for signs of wear and making any necessary repairs.\nkeywords: [car brakes, brakes for cars]\nRewrite to include keywords: Unreliable brakes for cars can be extremely dangerous. Even if your car brakes work, their performance can decrease over time, and your stopping distance may be compromised. So, you need check and test all your brake components regularly, keeping an eye out for signs of wear and making any necessary repairs.\n\nArticle: Unreliable brakes can be extremely dangerous. Even if your brakes work, their performance can decrease over time, and your stopping distance may be compromised. 
So, you need check and test all your brake components regularly, keeping an eye out for signs of wear and making any necessary repairs.\nkeywords: [car brakes, brakes for cars]\nRewrite to include keywords: Unreliable brakes for cars can be extremely dangerous. Even if your car brakes work, their performance can decrease over time, and your stopping distance may be compromised. So, you need check and test all your brake components regularly, keeping an eye out for signs of wear and making any necessary repairs to your car brakes.\n\nArticle:

This is a general list of our batteries, to search for the correct car battery, please type your reg number or select your vehicle from the menu to see our recommendations

\nkeywords: [car batteries, batteries for cars]\nRewrite to include keywords:

This is a general list of our car batteries, to search for the correct car battery, please type your reg number or select your vehicle from the menu to see our recommendations for batteries for cars.

\n\nArticle: \n\n At GSF Car Parts we have a wide range of car engine parts for you to browse, including parts for your Perodua Kelisa. Here, you’ll find everything you need to keep your car’s engine in perfect working order, including parts from top brands and trusted manufacturers. The engine is what provides the power that drives your car, and as a result, it\u2019s one of the most hard-working elements of a vehicle. That means that general wear and tear will take its toll on many of its components over time, so you need to check your engine regularly and replace its parts as needed. It’s important to buy spare parts for your vehicle that are compatible with the make and model of your car. To make this easy, we’ve provided a handy means of filtering our wide range of products. Simply enter your registration number into the box above, confirm the make and model, and we’ll do the rest. The car engine components we sell include belt drive parts, crankshaft parts and crankcases, cylinder head parts, engine timing control parts, and much more. We also have a variety of lubrication parts for your engine which are sure to keep things running smoothly. While you’re here, bear in mind that we also have an impressive selection of excellent value service parts, for light repairs and common replacements. We offer free UK delivery when you spend over \u00a325 across our online store. What’s more, we also offer a 365-day returns period for peace of mind*. So, there are plenty of reasons to buy engine parts for your car from us. Take a look at all of our car parts and accessories and place your order today.\nkeywords: [‘perodua kelisa’]”, “completion”: “\n\n At GSF Car Parts we have a wide range of car engine parts for you to browse, including parts for your Perodua Kelisa. Here, you’ll find everything you need to keep your car’s engine in perfect working order, including parts from top brands and trusted manufacturers. 
The engine is what provides the power that drives your car, and as a result, it\u2019s one of the most hard-working elements of a vehicle. That means that general wear and tear will take its toll on many of its components over time, so you need to check your engine regularly and replace its parts as needed. It’s important to buy spare parts for your vehicle that are compatible with the make and model of your car. To make this easy, we’ve provided a handy means of filtering our wide range of products. Simply enter your registration number into the box above, confirm the make and model, and we’ll do the rest. The car engine components we sell include belt drive parts, crankshaft parts and crankcases, cylinder head parts, engine timing control parts, and much more. We also have a variety of lubrication parts for your engine which are sure to keep things running smoothly. While you’re here, bear in mind that we also have an impressive selection of excellent value service parts, for light repairs and common replacements. We offer free UK delivery when you spend over \u00a325 across our online store. What’s more, we also offer a 365-day returns period for peace of mind*. So, there are plenty of reasons to buy engine parts for your car from us. Take a look at all of our car parts and accessories and place your order today.”}

Now, when I load the trained model in Playground I get results like this:

Is it my training data or my prompts in the playground or something else?

When I run that using text-davinci-002 I get pretty much exactly what I want.

Thanks so much for any help or guidance


What settings are you using?

You might also try adding a stop sequence (e.g. <|endoftext|>) so that it learns when to stop.

Also, 300 might not be enough. The rule of thumb (I believe) is to double the dataset size next, so try 600 samples and see if it’s better. Good to test smaller first, though.

Really could be any of a number of things, but I hope this helps.

P.S. Welcome to the forum!

I see ~3 problems.

  1. Writing prompts for fine-tuning is different than writing prompts for ChatGPT or completion endpoints. You should not include any instructions in the prompt for fine-tuning. The model will learn the patterns of what you want done from the examples.

    So your template would look something like this:

    Article: {{article_content}}
    Keywords: {{keywords_list}}
    

    With enough examples you can remove the semantic labels (Article:, Keywords:) too.

    Personal preference, but I would remove the brackets around the SEO keywords because they seem extraneous.

  2. In the playground, your fine-tuned model is most likely repeating your prompt text because you didn’t use a separator at the end of your prompt. The separator needs to be appended to the end of every prompt in your training dataset and included at the end of your playground prompt text. Common ones are “###” or “->”. This helps the model know when to start the completion instead of trying to keep writing your prompt.

  3. Make sure to use a stop sequence in the dataset too, appended to the end of every completion. “\n\n###\n\n” is common, where \n is a line break. This helps the model reach a stopping point and end its output. The same value needs to be added in the playground settings when you’re testing and passed in API calls.
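
To make the three points above concrete, here is a rough Python sketch of how one training record could be built. The helper names (build_prompt, build_record), the choice of “\n\n->” as the separator, the leading space on the completion, and the placeholder model name are all my own assumptions for illustration, not something taken from your dataset. The only API details I am relying on are the prompt/completion JSONL fields and the stop request parameter.

```python
import json
from typing import List

# Assumed conventions for this sketch; pick your own, but use them consistently.
SEPARATOR = "\n\n->"            # appended to every prompt (training and inference)
STOP_SEQUENCE = "\n\n###\n\n"   # appended to every completion in the dataset


def build_prompt(article: str, keywords: List[str]) -> str:
    """Prompt with no instructions: just the labelled inputs and the separator."""
    return f"Article: {article}\nKeywords: {', '.join(keywords)}{SEPARATOR}"


def build_record(article: str, keywords: List[str], rewritten: str) -> dict:
    """One training example as a prompt/completion pair."""
    return {
        "prompt": build_prompt(article, keywords),
        # A leading space plus the stop sequence at the end of the completion.
        "completion": f" {rewritten}{STOP_SEQUENCE}",
    }


record = build_record(
    "Digital PR is great",
    ["search engine optimisation", "SEO"],
    "Digital PR is great for search engine optimisation and SEO",
)

# Each record becomes one line of the JSONL training file.
jsonl_line = json.dumps(record)
print(jsonl_line)

# At inference time, reuse the same prompt builder and pass the stop sequence.
request_params = {
    "model": "YOUR_FINE_TUNED_MODEL",  # placeholder, not a real model name
    "prompt": build_prompt("apples are delicious", ["green"]),
    "stop": [STOP_SEQUENCE],
    "max_tokens": 256,
}
```

The point of routing both training and inference through the same build_prompt function is that the playground or API prompt then ends with exactly the separator the model saw during training, which is what tells it to start the completion rather than continue the prompt.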

Lastly, and I only mention this because I think it might help you and others with similar problems: I created a platform called Entry Point that helps eliminate most of these “gotchas” so you can get to results faster. You can read more about it and how to write better fine-tuning prompts in this article: Fine-tune Large Language Models the Easy Way: Complete Guide | Entry Point AI