Might be a silly question, so let me know if I’m going about this the wrong way.
I want to generate multiple pieces of content (30+). For instance, something close to my use case is generating movie titles for a given topic.
Generating the 30+ initial movie names is fine, but then I want to rephrase them according to some rules I have.
This is where I would like to provide fine-tuning examples by giving:
- Prompt: Bad movie title before rephrasing
- Completion: Rephrased movie title that I consider good
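A minimal sketch of what those training pairs could look like in the chat-format fine-tuning JSONL that OpenAI accepts (the titles, the system message, and the file name here are made-up placeholders, not my actual rules):

```python
import json

# Hypothetical before/after pairs; replace with your own titles and rules.
pairs = [
    ("The Movie About Space That Is Long", "Beyond the Stars"),
    ("A Film Where A Dog Is The Hero", "Good Boy"),
]

# One JSON object per line: a fixed system message, the "bad" title as
# the user prompt, and the rephrased title as the assistant completion.
with open("titles.jsonl", "w") as f:
    for bad, good in pairs:
        record = {
            "messages": [
                {"role": "system", "content": "Rephrase the movie title according to my rules."},
                {"role": "user", "content": bad},
                {"role": "assistant", "content": good},
            ]
        }
        f.write(json.dumps(record) + "\n")
```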
Is it possible to build my fine-tuning dataset this way (one movie title at a time) but then ask the model to rephrase 30+ titles at once (in a single API request)?
Will the model understand that it needs to apply the fine-tuning to each “sub-request” in a single prompt?
Generally, the more distinct things you ask the model to do at once, and the more repetitions you request, the worse your results will be.
It’s certainly possible it will be able to do it, but I think you’ll get better, more reliable, and more consistent results if you discretize your requests.
With respect to the fine-tuning, the more closely your tuning examples reflect what you’re actually going to be prompting it with, the better the results will be.
So, if you want to be able to batch the re-writes through a fine-tuning model, you’ll get better results if all of your training data consists of batch requests and responses.
All that said, I suspect it will be much more cost-effective to just run this through one of the existing models with a good system message and possibly one-shot or few-shot examples added on.
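That approach might look something like the sketch below: a system message stating the rules, a few-shot example pair, and then the title to rephrase. The rules, titles, and helper name are placeholder assumptions; the resulting list is what you would pass as the `messages` parameter of a chat completions request.

```python
# Sketch: prompt an existing chat model with a system message plus
# few-shot examples instead of fine-tuning. All content is made up.
def build_messages(title: str) -> list[dict]:
    return [
        {"role": "system", "content": "You rephrase movie titles according to these rules: ..."},
        # Few-shot example: a "bad" title and the rephrasing we want.
        {"role": "user", "content": "The Movie About Space That Is Long"},
        {"role": "assistant", "content": "Beyond the Stars"},
        # The title we actually want rephrased.
        {"role": "user", "content": title},
    ]

messages = build_messages("A Film Where A Dog Is The Hero")
```

You can add more example pairs as extra user/assistant turns if one shot isn’t enough to pin down your rules.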
A fine-tuned gpt-3.5-turbo model costs about 8x as much to use as the base model. So, unless the fine-tuning yields dramatically better results and you plan on doing an absurd number of these requests, you may very well do better with good prompting and an extra iteration or two.
Thanks a lot for this clear answer.
What do you mean by discretize in this context?
“consistent quality results if you discretize your requests”
Discretize means split them up.
So I would get the best results by rephrasing the movie titles one by one instead of in a batch.
That is my assumption.
It allows the model to focus only on the precise task at hand when doing each rephrasing.
When the model processes messages it looks at all the relationships between all the tokens. That’s why the computational complexity scales with the square of the context-length.
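To make that scaling concrete: self-attention relates every token to every other token, so the number of pairwise relationships grows with the square of the context length, and doubling the context quadruples the work.

```python
# Number of pairwise token relationships in self-attention:
# every token attends to every token, including itself.
def pairwise_relationships(context_length: int) -> int:
    return context_length * context_length

# Doubling the context length quadruples the relationship count.
assert pairwise_relationships(2000) == 4 * pairwise_relationships(1000)
```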
So, even though the models are very good at focusing their attention on the relevant details of the context, all those irrelevant details will exert subtle influences on the generation.
So my expectation is you will get the best quality results more consistently by processing them individually.
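The one-at-a-time approach is just a loop over the titles, one request each. In this sketch `rephrase` is a stand-in placeholder; in practice it would make a single API call per title:

```python
# Process titles individually so each request contains only the one
# task at hand. `rephrase` is a placeholder for a per-title API call;
# here it just title-cases the string so the sketch is runnable.
def rephrase(title: str) -> str:
    return title.title()  # placeholder transformation

titles = ["the movie about space", "a film where a dog is the hero"]
rephrased = [rephrase(t) for t in titles]
```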