I’ve read the fine-tuning documentation and as many posts on this forum about fine-tuning as I could, but I’m still feeling a bit lost and hoping I can get some assistance. Here is what I’m trying to do:
I want to fine-tune a model that understands a set of guidelines/laws, so that when I provide it with text it can say Yes or No to whether the text adheres to those guidelines/laws. For example:
Guideline: any advertisement related to marketing should not include any preferences, limitations, specifications, or discrimination based on certain protected classes, such as age, ancestry, disability, familial status, race, religion, sex, and the use of guide animals. The advertisement should also not imply any preferences, exclusions, or limitations based on these protected classes. Examples of prohibited practices include referencing specific religious or ethnic institutions or landmarks nearby.
Example Listing: Advertisement says “Just around the corner from St. Andrew’s Catholic Church.”
Answer: No
Playground link for context: OpenAI API
It’s unclear to me what I would provide in the prompt and completion for this. Would I provide the same prompt, which is the guidelines, and different completions that would be incorrect? Note that I have multiple guidelines with examples of what not to say; this is just one example. TIA!
Additionally, I would like to be able to use the model to write marketing text that follows these guidelines. I’m not sure if there is a way to get both: a Yes or No on whether a text followed the guidelines, and text written by the model that adheres to them.
1. Fine-tune a base model on sample inputs and sample outputs in that format, covering the different cases (listings) you expect to have, and then prompt it to answer based on them.
2. Provide the guidelines in the context window along with a sample or two, and then ask for a user-input listing.
The first method is more targeted to your use case, but the fine-tuning process is itself a bit time-consuming, especially if you don’t have a good number of samples at hand (~100). The second method is more flexible, but you would need to prompt it really well to restrict the output to only a yes or a no.
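For the first method, the training file could look something like the sketch below. It is only an illustration: it assumes the prompt/completion JSONL format, and the separator, the leading space in the completion, and the second listing are my own assumptions rather than anything confirmed in this thread.

```python
import json

# Assumed conventions (not from this thread): a fixed separator ends every
# prompt, and every completion starts with a single space.
SEPARATOR = "\n\n###\n\n"


def to_record(listing: str, compliant: bool) -> dict:
    """Turn one labelled listing into a prompt/completion pair."""
    return {
        "prompt": f"Listing: {listing}{SEPARATOR}",
        "completion": " Yes" if compliant else " No",
    }


def to_jsonl(examples: list[tuple[str, bool]]) -> str:
    """Serialize labelled listings as JSON Lines, one record per line."""
    return "\n".join(json.dumps(to_record(text, ok)) for text, ok in examples)


# The first listing is the example from the original post;
# the second is a hypothetical compliant listing.
examples = [
    ("Just around the corner from St. Andrew's Catholic Church.", False),
    ("Two-bedroom apartment with a renovated kitchen and off-street parking.", True),
]

print(to_jsonl(examples))
```

Both compliant and non-compliant listings go in as prompts, and only the Yes/No label goes in the completion.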
Thanks @udm17. I think what you’re saying in point 1 is to give it listings that are both good and bad as the PROMPT, with a COMPLETION of YES or NO? As for point 2, I like this approach, but how would I then scale it to all my data, since this would just live in the playground?
Thanks again @udm17. I did make an edit to my original post to add that I would also like to use the model to write marketing text that follows these guidelines, so that I get both a Yes or No on whether a text followed the guidelines and text written by the model that adheres to them. Would I follow the same principles you provided?
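On scaling the second approach beyond the playground, one possible shape is a small script that builds the same few-shot prompt for every listing and sends it to whatever model call you use. The sketch below is an assumption-heavy illustration: `complete` is a hypothetical callable standing in for a real API call, `fake_complete` is a stub so the example runs offline, and the guideline text is an abbreviated paraphrase of the one in the original post.

```python
from typing import Callable, Iterable

# Abbreviated paraphrase of the guideline from the original post.
GUIDELINE = (
    "Advertisements must not state or imply preferences, limitations, or "
    "discrimination based on protected classes."
)

# Few-shot samples; the second one is hypothetical.
FEW_SHOT = [
    ("Just around the corner from St. Andrew's Catholic Church.", "No"),
    ("Bright one-bedroom apartment close to public transit.", "Yes"),
]


def build_prompt(listing: str) -> str:
    """Assemble guideline, samples, and the new listing into one prompt."""
    parts = [
        f"Guideline: {GUIDELINE}",
        "Answer Yes if the listing follows the guideline, otherwise No.",
        "",
    ]
    for text, answer in FEW_SHOT:
        parts += [f"Listing: {text}", f"Answer: {answer}", ""]
    parts += [f"Listing: {listing}", "Answer:"]
    return "\n".join(parts)


def normalize(raw: str) -> str:
    """Map free-form model output to 'Yes', 'No', or 'Unclear'."""
    tokens = raw.strip().lower().split()
    first = tokens[0].strip(".,!:;") if tokens else ""
    return {"yes": "Yes", "no": "No"}.get(first, "Unclear")


def classify_all(listings: Iterable[str], complete: Callable[[str], str]) -> list[str]:
    """Run every listing through the model call and normalize each answer."""
    return [normalize(complete(build_prompt(item))) for item in listings]


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model call; only looks for 'church' in the last listing."""
    last_listing = prompt.rsplit("Listing:", 1)[1]
    return " No." if "church" in last_listing.lower() else " Yes."


listings = [
    "Steps from the Baptist church on Main Street.",
    "Sunny studio with hardwood floors.",
]
print(classify_all(listings, fake_complete))
```

Normalizing the output matters because, as noted above, the model will not always reply with a bare yes or no; anything else is reported as `Unclear` here so it can be reviewed by hand.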
Context: In my experience, fine-tuning will certainly make the model focus on the new data and mainly use the new corpus and its style when answering questions. However, the answers are not strictly the ‘truth’. In the case of responding Yes/No to questions like the one above, you can come up with variations of the question for which the correct answer is true but the model answers false.
In other words, so far I’ve seen that fine-tuning is not the ideal way to do ‘TRUTH’ verification.
Is this statement ‘true’ or ‘false’? Has anyone fine-tuned a model and obtained answers that are highly reliable?
On the other hand, I’ve found that embeddings are a much better way to build a corpus for Q&A.
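To make the embedding idea a bit more concrete, here is a minimal sketch of the retrieval step: each guideline is stored with a vector, and the guideline closest to the query vector is pulled into the prompt. The three-dimensional vectors and the second guideline title are made up for illustration; in practice the vectors would come from an embedding model, which this sketch does not call.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_vec: list[float], corpus: list[tuple[str, list[float]]]) -> str:
    """Return the text of the corpus entry whose vector is closest to the query."""
    return max(corpus, key=lambda item: cosine(query_vec, item[1]))[0]


# Hypothetical guideline titles with made-up 3-dimensional vectors.
corpus = [
    ("Guideline on protected classes in advertisements", [0.9, 0.1, 0.0]),
    ("Guideline on pricing disclosures", [0.1, 0.9, 0.2]),
]

# Made-up vector standing in for the embedding of a new listing.
query_vec = [0.8, 0.2, 0.1]

print(most_similar(query_vec, corpus))
```

The retrieved guideline text would then be placed in the prompt next to the listing, so the model answers against the actual wording rather than from what it absorbed during fine-tuning.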
Totally agree, @juan_olano, embeddings would make more sense. I’ve additionally asked whether I can train the model to “write marketing text that follows these guidelines as well.” I assume fine-tuning makes more sense in that case?
Yes, in my experience the marketing text case would make more sense with fine-tuning.
Now, I wonder: if I fine-tune the model with a huge corpus, will its answers be closer to the truth? How does this correlate with the amount of data? If I want 95% of my questions answered properly (answer = true answer, not just a completion) after fine-tuning, how much data should I feed the model?
For style and use of the corpus, just a few examples are enough, but for truth I suspect I would need a very large corpus.
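I don’t have a number for how much data is needed, but the 95% target can at least be measured: hold out labelled listings that the model never saw during training and compare its answers with the true labels. A minimal sketch, where the predictions and labels are made-up placeholders and `meets_target` is a hypothetical helper name:

```python
def accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one labelled example")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def meets_target(predictions: list[str], labels: list[str], target: float = 0.95) -> bool:
    """True when held-out accuracy reaches the target."""
    return accuracy(predictions, labels) >= target


# Made-up held-out set: 20 labels, with the model wrong on exactly one.
labels = ["Yes", "No"] * 10
predictions = list(labels)
predictions[3] = "Yes"  # the true label at index 3 is "No"

print(accuracy(predictions, labels), meets_target(predictions, labels))
```

Repeating this after fine-tuning on progressively larger training sets would show how accuracy changes with the amount of data for a particular set of guidelines.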
@wfhbrian I want to get your thoughts on doing this as well:
“Additionally, I would like to be able to use the model to write marketing text that follows these guidelines as well, not sure if there is way to get both (yes or no to if it followed the guidelines) and have the model write text adhering to the guidelines”
How could I accomplish both, or would they need to be separate?
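One possible arrangement, offered as a sketch rather than a recommendation from anyone in this thread, is to keep the two tasks separate and chain them: one call writes a draft, the Yes/No checker reviews it, and the draft is regenerated if it fails. The `generate` and `check` callables are hypothetical placeholders for whichever models end up doing each job, and the stubs below exist only so the example runs.

```python
from typing import Callable, Optional


def write_compliant_text(
    brief: str,
    generate: Callable[[str], str],
    check: Callable[[str], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first draft the checker answers 'Yes' to, or None if none passes."""
    for _ in range(max_attempts):
        draft = generate(brief)
        if check(draft) == "Yes":
            return draft
    return None


# Stubs standing in for real model calls (hypothetical behaviour).
drafts = iter([
    "Cozy flat next to St. Mary's Church.",
    "Cozy flat with a private balcony.",
])


def fake_generate(brief: str) -> str:
    """Return the next canned draft instead of calling a model."""
    return next(drafts)


def fake_check(text: str) -> str:
    """Answer 'No' when the text mentions a church, otherwise 'Yes'."""
    return "No" if "church" in text.lower() else "Yes"


print(write_compliant_text("cozy flat", fake_generate, fake_check))
```

Keeping the checker separate means the Yes/No step can be evaluated on its own, whichever method ends up generating the text.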
I have been pondering the same question for several days now. My main goal is to get ChatGPT to write articles in specific domains and I have also attempted fine-tuning. However, the results were not as good as directly using GPT-4 for creating content. This is because it is difficult for me to generate suitable prompt/completion data during the fine-tuning process. This might be similar to the marketing text writing issue you mentioned.
In areas such as ChatBots, noun explanations, and customer service systems, the current fine-tuning methods should be more suitable.
This is my progress so far, but I am still trying to make further attempts.
If OP figures out how this should be done, please drop a message here. I want to accomplish something similar to your use case. Unfortunately, I do not understand one bit of what others replied to you, so hopefully I can understand with a ‘visual’ representation.