GPT fine-tune: need to fine-tune a model that uses references or a dictionary

Here is the example provided in the OpenAI docs:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}

I need to fine-tune a model that also uses some base references:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "reference", "content": "factual are standard, sarcasm is humour."},{"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}

Is this additional role, i.e., reference, allowed?


Only a fixed set of role names (system, user, and assistant) can be passed through the API to the AI.

If you want an AI to sometimes have different behavior, that should be part of a system message that you would also use in the application.

The system message can simply be a trigger into that different behavior path, like “You are Marv the irreverent chatbot”, and then a set of user inputs and altered AI responses demonstrates how the AI responds as Marv.
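For example, here is a sketch of your own training line with the reference text folded into the system message instead of a separate role (the exact wording of the system message is up to you):

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic. Facts are standard; sarcasm is humour."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}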

Do you mean that the dictionary, i.e., the additional info that I want to provide to the AI, should be part of the system message only?

A dictionary is a reference work with concrete answers rooted in fact.

Fine-tuning, instead, changes the behavior of AI responses: the likelihood of token generations and the answering style.

If you want to instill the ability to answer from documents within an AI product, you’ll instead want to investigate automated retrieval of information from a vector database of embeddings, where the documentation to answer from is injected directly into the operational AI context.
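As a rough sketch of that retrieval flow, assuming the openai Python SDK, an embedding model, and a toy in-memory list standing in for the vector database (the retrieve() helper and the model names are illustrative choices, not a fixed API):

import numpy as np
from openai import OpenAI

client = OpenAI()

# Embed the reference documents once and keep them in a toy in-memory store.
docs = [
    "Facts are standard; sarcasm is humour.",
    "Paris is the capital of France.",
]
doc_vectors = [
    item.embedding
    for item in client.embeddings.create(model="text-embedding-3-small", input=docs).data
]

def retrieve(question, k=1):
    # Return the k documents whose embeddings are most similar to the question.
    q = client.embeddings.create(model="text-embedding-3-small", input=question).data[0].embedding
    scores = [float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d))) for d in doc_vectors]
    ranked = sorted(zip(scores, docs), reverse=True)
    return [doc for _, doc in ranked[:k]]

question = "What's the capital of France?"
context = "\n".join(retrieve(question))

# Inject the retrieved reference text into a user message; no custom role is needed.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # or your fine-tuned model id
    messages=[
        {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."},
        {"role": "user", "content": "Reference:\n" + context + "\n\nQuestion: " + question},
    ],
)
print(response.choices[0].message.content)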

(An “additional_documentation” role type would indeed be useful if OpenAI had made one, but currently you need to frame the augmentation message within a user or assistant role.)
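A sketch of what that framing could look like in a training line, with the reference text prepended to the user message (the “Reference:” label is just an illustrative convention):

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Reference: facts are standard, sarcasm is humour.\n\nWhat's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}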

Your example doesn’t show that, though. You have an additional message that attempts to change behavior, while it is in fact the fine-tune on new responses itself that changes behavior.

Understood. I will inject the additional message in the user role so that the assistant can be tuned to work accordingly.
Thanks for the guidance and for clarifying.