Fine Tuning Help defining Prompt/Completion

chirag.shah285 · March 30, 2023, 1:01pm

Hey all,

I feel like I’ve read the fine tuning documentation, read as many things on this forum I could around fine tuning, but still feeling a bit lost, hoping I can get some assistance. Here is what I’m trying to do:

I want to fine tune a model that understand a set of guidelines/laws, such that when I provide it text it can say Yes or No to whether or not it has adhered to the guidelines/laws provided. For example:

Guideline: any advertisement related to marketing should not include any preferences, limitations, specifications, or discrimination based on certain protected classes, such as age, ancestry, disability, familial status, race, religion, sex, and the use of guide animals. The advertisement should also not imply any preferences, exclusions, or limitations based on these protected classes. Examples of prohibited practices include referencing specific religious or ethnic institutions or landmarks nearby.

Example Listing: Advertisement says “Just around the corner from St. Andrew’s Catholic Church.”
Answer: No
Playground link for context: OpenAI API

It’s unclear to my what I would provide in the prompt and completion for this. Would I provide the same prompt which is the guidelines and different completions that would be incorrect? Note I have multiple guidelines with examples of what not to say this is just an example. TIA!

Additionally, I would like to be able to use the model to write marketing text that follows these guidelines as well, not sure if there is way to get both (yes or no to if it followed the guidelines) and have the model write text adhering to the guidelines

wfhbrian · March 30, 2023, 1:06pm

Your prompt would just be the ad text. And the completion would be yes/no.

chirag.shah285 · March 30, 2023, 1:08pm

Thanks @wfhbrian for the quick reply, so how do I feed it the guidelines/laws it needs to adhere to?

udm17 · March 30, 2023, 1:09pm

Hi Chirag,

There are two ways you could go about this.

Train a base model with the sample input and sample output format for multiple possible cases (listings) that you will have and then prompt GPT to answer based on them.
You could provide the guidelines in the context window along with a sample or two and then ask for an user input listing.

The first method will be more targeted towards your approach but the fine-tuning process is itself a bit time consuming, especially if you don’t have a good number of samples at hand (~100). The second method would allow you to be flexible but you would need to prompt it really well to make the GPT output targeted towards only a yes or a no,

chirag.shah285 · March 30, 2023, 1:12pm

Thanks @udm17 I think what your saying to point 1 is give it listing that are both good and bad PROMPT, with the completion of YES or NO? Furthermore to point 2, I like this approach but how would I then scale this to all my data, as this would just live in the playground.

udm17 · March 30, 2023, 1:13pm

Yeah. You got that right. Hopefully, that will work but I suggest you try out number 2 first as it is less time consuming

chirag.shah285 · March 30, 2023, 1:16pm

Thanks again @udm17 . I did make an edit to my original post where I also want to “Additionally, I would like to be able to use the model to write marketing text that follows these guidelines as well, not sure if there is way to get both (yes or no to if it followed the guidelines) and have the model write text adhering to the guidelines” . Would I follow the same principles provided?

juan_olano · March 30, 2023, 1:16pm

I have a question.

Context: In my experience, fine-tuning will certainly focus on the data and use mainly the new corpus and its style to answer to questions, however, the answers are not strictly the ‘truth’. In the case of responding Yes/No to questions like the above, you can come up with variations of the questions that are true but the model answers false.

In other words, so far I’ve seen that fine-tuning is not the ideal way for 'TRUTH" verification.

Is this statement ‘true’ or ‘false’? has anyone fine-tuned a model and obtain answers that are highly reliable?

On the other hand, I’ve found that embedding is a much better way to create a corpus for Q&A.

wfhbrian · March 30, 2023, 1:17pm

They’re implicit in your training data.

chirag.shah285 · March 30, 2023, 1:18pm

ok I see what your saying, thanks again @wfhbrian

chirag.shah285 · March 30, 2023, 1:19pm

Totally agree @juan_olano embedding would make more sense. I’ve additionally asked to see if I can train the model to “model to write marketing text that follows these guidelines as well.” I assume in that case fine-tuning makes more sense?

juan_olano · March 30, 2023, 1:23pm

Yes, in my experience the marketing text case would make more sense with fine-tuning.

Now, I wonder: If I fine-tune the model with a huge corpus, will it be closer to answer the truth? how is this correlated to amount of data? If I want 95% of my questions being answered properly (answer = true answer, not just completion) after fine-tuning, how much data should I feed the model with?

Because, for style and use of corpus, just a few examples are enough, but for truth, I am thinking that I would need a very large corpus.

chirag.shah285 · March 30, 2023, 1:26pm

@juan_olano good point, how would you quantify a few examples for building the text?

chirag.shah285 · March 30, 2023, 1:28pm

@wfhbrian I want get your thoughts on doing this as well

"Additionally, I would like to be able to use the model to write marketing text that follows these guidelines as well, not sure if there is way to get both (yes or no to if it followed the guidelines) and have the model write text adhering to the guidelines”

How could I accomplish both or would they need to be seperate?

juan_olano · March 30, 2023, 1:37pm

While @wfhbrian shares his answer, I’d say that I would experiment with as large as possible corpus to fine tune the model.

In the fine-tuning, I would create as many prompts as possible with questions and their answers.

To generate questions you can use GPT4:

Give it a text and instruct it to write, say, 10 questions about that text, and then create 10 entries in your JSONL with it.

And then I’d also add to the JSONL just data with no prompt.

I would probably experiment like that.

leoprctmp · March 30, 2023, 1:44pm

I have been pondering the same question for several days now. My main goal is to get ChatGPT to write articles in specific domains and I have also attempted fine-tuning. However, the results were not as good as directly using GPT-4 for creating content. This is because it is difficult for me to generate suitable prompt/completion data during the fine-tuning process. This might be similar to the marketing text writing issue you mentioned.

In areas such as ChatBots, noun explanations, and customer service systems, the current fine-tuning methods should be more suitable.

This is my progress so far, but I am still trying to make further attempts.

wfhbrian · March 30, 2023, 4:50pm

Use GPT-4 to re-write with your requirements in the system role prompt, and then validate the output using the fine-tuned model.

sant3e · March 31, 2023, 12:42pm

If OP figures out how this should be done, please drop a message here. I want to accomplish something similar to your use case … unfortunately i do not understand one bit of everything others replied to you … so hopefully, i can understand with a ‘visual’ representation

Topic		Replies	Views
Are fine-tuned models a good way to give GPT a specific tone of voice? API api	5	3930	July 20, 2023
Training gpt-3.5 to autocomplete for a niche domain and a specific writing style Community chatgpt	13	1830	July 25, 2024
What's better for the type of chatbot I am building? Fine tune or embedding? Community chatgpt , api	10	2249	August 20, 2023
Fine-Tuning with Non-Prompt/Completion Data: Seeking Advice for Direct Text-Based Training? API gpt-4 , chatgpt , fine-tuning , api	3	428	August 23, 2024
Fine tuning for writing style - lessons and questions API fine-tuning	5	3002	January 17, 2024

Fine Tuning Help defining Prompt/Completion

Related topics