Persona-based fine-tuning

Hello,

I am referring to the post "Use this JSON User Profile template to train/fine-tune GPT-4", which attempts to fine-tune an LLM for a persona. I have the following queries.

  1. Is the format presented in the above link a standard format for fine-tuning the OpenAI LLM for a persona? That is, are the various keys (personality_traits, values_and_beliefs, goals_and_aspirations, skills, interests_hobbies, health_wellness, personal_achievements, important_relationships, values_in_interactions, specific_requirements, role, context, tone, language_style, writing_style, etc.) fixed, or can the format be completely customised to our needs?

  2. Once we have the persona data, should we embed it in the example format (JSONL) provided in the link (https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset) for fine-tuning, in the format below?
    {
      "role": "system",
      "content": "#{sjson}"
    },
    {
      "role": "user",
      "content": "#{ujson}"
    }

Here, sjson would follow the format containing personality_traits, values_and_beliefs, goals_and_aspirations, etc., i.e. the key-value pairs of a persona. Is this correct?

  3. The OpenAI fine-tuning page mentions that there should be about 100 examples. Should we simply repeat the same persona across 100 responses, just like the sarcastic example responses given in the link (https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset)? Please confirm my understanding.
  4. If some specialised information is to be added, given that the model is generic, how should we add it to the assistant message content?

Nice questions. Being just a beginner, please tell me: do our chat history with GPT, our use of coding or APIs, or fine-tuning make any modifications to the LLM's data? If so, is it ethical or legal?

Hi @rao.ranganaths

  1. Customizability: The JSON structure with keys like personality_traits isn't the standard format for fine-tuning a model; JSONL is. My post from over a year ago referred to customizing a single chat thread rather than fine-tuning the model itself, so the format I shared was meant for structuring context, not training. Fine-tuning itself is flexible: you can customize the data format as needed, as long as it's consistent and relevant to the persona.

  2. Embedding Persona Data: Fine-tuning a model uses JSONL, where each line represents an interaction. If you want to include persona details, you could place them in the system message as context. However, this would be more for providing context than for directly fine-tuning the persona into the model. Fine-tuning would focus on behavior patterns over many examples (see the sketch after this list).

  3. 100 Examples Requirement: OpenAI recommends at least 100 varied examples for fine-tuning to ensure effective learning. Repeating the exact same example isn't recommended; use multiple varied examples that all align with the desired style or behavior.

  4. Specialized Information: This can be done by adding specific responses or interactions that demonstrate how the model should process or respond to that specialized content. For instance, include multiple examples where the system message provides unique guidance based on the specialized information, and the user message interacts with it. This way, the model learns to adapt its responses based on that context.
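
For illustration, here is a minimal sketch of two lines from such a JSONL training file. All the persona details, names, and wording here are hypothetical placeholders rather than a prescribed schema; each line is one complete training example:

    {"messages": [{"role": "system", "content": "You are Ava. personality_traits: witty, patient; tone: friendly; language_style: simple."}, {"role": "user", "content": "What is an API?"}, {"role": "assistant", "content": "Think of it like a restaurant menu: you place an order, and the kitchen handles the rest!"}]}
    {"messages": [{"role": "system", "content": "You are Ava. personality_traits: witty, patient. Specialized guidance: recommend only products from the 2024 catalogue."}, {"role": "user", "content": "Which laptop should I buy?"}, {"role": "assistant", "content": "From the 2024 catalogue, I'd start with a lightweight ultrabook. Want to narrow it down together?"}]}

Note that the persona stays consistent in the system message while the user/assistant pairs vary; it is that variation, not the repeated profile, that teaches the behavior pattern.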

Hope this helps!


Hi @bbhusari

The short answer is no: your interactions, coding, API use, and even customizing chat threads with GPT do not modify the underlying large language model (LLM) itself.

Here’s a breakdown of each aspect:

  1. Chat History and Interactions: Your chat history with GPT helps tailor responses to your ongoing conversations but does not alter the model’s training data. This customization is limited to your current session and any context or preferences you might want remembered (like specific preferences in responses), but it doesn’t change the model’s core data or capabilities.
  2. Use of Coding or APIs: When you use GPT’s API, you are essentially querying the model with your inputs. The API provides responses based on pre-existing knowledge and training; the interactions through the API don’t alter the model’s training data or behavior outside of that session. Data privacy and security measures ensure that user interactions do not feed back into the model’s training without explicit permission.
  3. Fine-Tuning: Fine-tuning an LLM involves training a base model on additional, task-specific data to make it perform better on particular tasks. However, customizing interactions in a single session or adjusting the context for a specific conversation is not the same as fine-tuning. True fine-tuning happens in a controlled environment and requires a separate training process using a custom dataset (see the sketch after this list).
  4. Ethical and Legal Considerations: Ethical and legal concerns typically arise if user data is used for training or improving a model without proper consent. OpenAI has strict data usage policies to protect user privacy and ensure that data from interactions isn’t used to retrain the model unless explicitly stated and agreed upon by the user.

Regular use of GPT (like through chats or using APIs) does not alter the LLM’s training data or structure. It’s only in special cases of explicit fine-tuning (often outside the standard user interaction scope) that a model might be modified, which would require clear consent and adherence to data privacy regulations.
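
To make that distinction concrete, here is a minimal sketch using the OpenAI Python SDK. The file name and the model names are placeholder assumptions; check the current fine-tuning docs for supported base models:

    # pip install openai
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Regular API use: a stateless query. Nothing about the model changes.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply.choices[0].message.content)

    # Explicit fine-tuning: a separate, deliberate process on your own dataset.
    # Step 1: upload the prepared JSONL training file.
    training_file = client.files.create(
        file=open("persona_training.jsonl", "rb"),  # placeholder file name
        purpose="fine-tune",
    )

    # Step 2: start a fine-tuning job. This produces a new, private fine-tuned
    # model ID and leaves the shared base model untouched.
    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-4o-mini-2024-07-18",  # placeholder base model
    )
    print(job.id, job.status)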


Thank you @michael23 for the detailed response. Based on it:

  1. " However, this would be more for providing context rather than direct fine-tuning the persona into the model. Fine-tuning would focus on behavior patterns over many examples."
    So, fine-tuning is acting doubly in defining the behavious and providing the context. Please confirm.
  2. " For instance, include multiple examples where the system message provides unique guidance based on the specialized information, and the user message interacts with it."
    Does this require all possible responses for different combinations of persona components in JSONL? Please confirm.

Thanks a lot for such a nice reply!

In addition to the above queries, I have a couple more below:

  1. How do we fine-tune an LLM for multiple personas? Is it just a matter of mixing the examples from the different personas with their corresponding responses?

  2. If we are to provide specialised content about products, and there are hundreds of combinations of persona components (age, profession, geography, etc., or combinations of these) that each lead to, say, a product recommendation, should we list out all such examples? If so, what is the expectation from the LLM?

    Thanks once again for your time.

Hello, I request clarification on the above four queries.

Hello @michael23 ,

I tried your suggestion of using the context details in the sjson content, in the format below:
    "content": [ {"Domain": "The literature belongs to the domain of Computer Science"}, {"Exam": "Librarian"} ]

But I got the error: Invalid file format. Line 1, message 1: Unable to extract tag using discriminator 'type'

When I changed it to

    "content": "The exam is for a librarian and the literature belongs to the domain of Computer Science"

it worked. Any observations on this?
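
A likely explanation, offered as an assumption based on the error message rather than anything confirmed in this thread: in the chat fine-tuning format, content must be a plain string, or an array of typed parts where each part carries a "type" field that the parser uses as a discriminator. A bare array of arbitrary objects matches neither shape, so parsing fails. For example, both of these shapes are well-formed:

    {"role": "system", "content": "The exam is for a librarian and the literature belongs to the domain of Computer Science."}
    {"role": "system", "content": [{"type": "text", "text": "The exam is for a librarian and the literature belongs to the domain of Computer Science."}]}

The plain-string form is the safer choice for fine-tuning files, which is presumably why the second attempt worked.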