Hello. Is it possible to fine-tune a chat model using only plain text with the Python OpenAI API? Or do I HAVE to make these fine-tuning JSON files? Thanks!
The background behind the individual messages is that they are containerized when you send them to the AI using the normal chat completions API. This reinforces the role and identity of an exchange between a “user” and an “assistant”, and also provides a “system” role for out-of-band instruction to the AI.
Since this series of containers is actually seen by the AI, and you are showing the AI examples of how it should write differently than normal, you must also employ this training format.
Each line of a training file contains a series of messages showing an example conversation: a “system” identity with which you can make your fine-tune distinct from the rest of the pretraining, the user input, and then the new type of AI response you are soliciting.
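To make that concrete, one line of such a chat fine-tuning file is a JSON object with a "messages" list; a minimal made-up example (the persona and the exchange are just placeholders, not anything from this thread):

{"messages": [{"role": "system", "content": "You are Marv, a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everybody doesn't know that already."}]}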
Even outside of “chat completions”, when fine-tuning the normal completions models (untuned for chatting) such as davinci-002, which still exist, you still need to provide some structure in the training: a separator between the user input and the AI generation (and your use of the API reflects that), so the AI knows where it is supposed to begin writing as a new entity. You still end up making a sort of chat format; otherwise the AI will just keep writing where your input left off.
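The legacy prompt/completion training format for those models looks roughly like this, one JSON object per line; the "\n\n###\n\n" separator and the trailing "\n" stop string are just one common convention, not a requirement:

{"prompt": "What's the capital of France?\n\n###\n\n", "completion": " Paris\n"}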
Yes: I want to give it new information. Is that even possible?
Huh, indeed. text-davinci-003 is gone, davinci-002 is still available. I’m losing it.
People typically use RAG (retrieval augmented generation) for that. If you just want to try it out, you might be able to get a prototype running with a custom GPT, or Assistants, and then take a look at vector DBs.
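If you do roll your own prototype, the core retrieval loop is small; here is a rough sketch with the Python SDK, where the embedding model choice, the chat model, and the two-document in-memory “database” are just placeholders for illustration:

import numpy as np
from openai import OpenAI

client = OpenAI()

# Toy in-memory "vector db": embed the reference texts once and keep the vectors.
docs = ["Our return policy lasts 30 days.", "Support is available 9-5 CET."]
doc_vecs = [
    client.embeddings.create(model="text-embedding-ada-002", input=d).data[0].embedding
    for d in docs
]

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer(question: str) -> str:
    # Embed the question, grab the most similar document, and pass it as context.
    q = client.embeddings.create(model="text-embedding-ada-002", input=question).data[0].embedding
    best = max(range(len(docs)), key=lambda i: cosine(q, doc_vecs[i]))
    chat = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only this context:\n" + docs[best]},
            {"role": "user", "content": question},
        ],
    )
    return chat.choices[0].message.content

print(answer("How long do I have to return an item?"))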
text-davinci-003 was instruct-trained GPT-3.
gpt-3.5-turbo-instruct is its InstructGPT replacement.
davinci was a GPT-3 base model that is a predictive completions engine.
davinci-002 is the next version (likely based on 3.5 scaling).
The following forum link has an example of what it takes to get the newest base model to form cohesive responses if you don’t train by fine-tuning: train by multi-shot input, placing the AI into a state as if it were still completing a “document”.
In the past I’ve always gone for ending the prompt with a schema, followed by the first token in the schema ({" or <). That requires less creativity than wondering how to fit the content into the schema without tainting the output.
That could also explain why I don’t like fine-tuning: I’ve never really had to investigate multi-shotting. But it does seem to have its merits in certain use cases.
As an AI assistant, Anita, you are adept at providing in-depth explanations on a wide range of topics. You can elucidate intricate subjects such as the differences between quantum and traditional computing, where you're able to explain the unique properties and advantages of quantum bits or 'qubits'. You're also well-versed in significant scientific concepts, like the role of the Higgs boson particle in particle physics, and the elusive nature of dark matter in the universe.
Your knowledge extends to addressing ethical considerations, particularly in the realm of artificial intelligence. You can discuss concerns such as privacy issues, the potential for job displacement, decision-making autonomy, and the risks of algorithmic bias. Additionally, you're capable of explaining natural processes and their global impact, like how photosynthesis contributes to the carbon cycle and influences Earth's climate.
Moreover, you have the expertise to break down complex scientific theories, such as Einstein's theory of relativity, making them accessible and understandable. This includes explaining the principles of special and general relativity and their implications for our understanding of space, time, and gravity.
Finally, Anita, you should be able to introduce yourself and describe the various ways you can be utilized as an AI assistant. This includes providing educational insights, clarifying scientific concepts, discussing ethical implications in technology, and explaining natural phenomena and theories. Your role is to assist users in expanding their understanding of diverse and complex subjects, offering detailed and accurate information.
In essence, Anita, your role is to act as a knowledgeable and reliable source of information across a broad spectrum of topics, aiding users in their quest for understanding and insight.
A chat conversation is always a JSON that follows this schema:
({"anita": string}|{"user": string})[]
begin by introducing yourself and telling the user what you can do, and then carry the conversation!
[
{"anita": "
Hello there, I am Anita, an AI assistant designed to provide in-depth explanations on a variety of topics."},
{"anita": "I have expertise in subjects such as quantum and traditional computing, particle physics, ethical considerations in AI, and global natural processes."},
{"anita": "What would you like me to explain or elaborate on? I am here to assist in expanding your understanding of complex subjects."},
stop token reached
… you get the idea.
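For completeness, the call behind that kind of continuation is just the plain completions endpoint, with the prompt ending right where the AI should pick up and a stop sequence to cut it off; a rough sketch, where the file name, stop strings, and sampling settings are my own guesses:

from openai import OpenAI

client = OpenAI()

# The long Anita description plus the schema instruction, ending exactly where
# the model should continue: an opened {"anita": " entry.
prompt = open("anita_prompt.txt").read() + '\n[\n{"anita": "'

response = client.completions.create(
    model="davinci-002",
    prompt=prompt,
    max_tokens=200,
    temperature=0.7,
    stop=['{"user"', "\n\n"],  # guesses at where to cut off a runaway completion
)
print(response.choices[0].text)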
I think this is suboptimal, although the AI is versatile.
Chat models only understand they are an entity that can be addressed because of a huge amount of training.
You have to consider the entire input to a base completion model in a particular frame of understanding: it doesn’t have predefined behaviors, it only wants to complete text in a manner consistent with what has been seen before.
You’re talking to someone that doesn’t exist.
davinci-002 goes loopy with your instructions defining an AI:
I asked ChatGPT to summarize your prompt in prose, in second person; don’t read too much into that -
That’s what the schema, the word JSON, and the { are for.
The rest of the prompt can be anything - that was my point; you don’t have to think about how to press your content into the schema you want the model to respond in.
To be honest, I think I’ve been a little tainted by the new gpt-4 chat stuff.
I think "begin by introducing yourself and telling the user what you can do, and then carry the conversation!" has been a gpt-4 adaptation.
3rd person summary
-----
We will now simulate a chat conversation, with Anita introducing herself and her capabilities.
A chat conversation is always a JSON that follows this schema:
{"anita"|"user": string}[]
begin conversation:
[
{"anita": "
It will screw up occasionally, and sometimes just run away.
Multi-shot is probably a more robust solution, and fine-tuning will likely help. But my thinking is that if you have a 25% rejection rate, you’re still better off than paying 400% more for each generation. Depends!
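That rejection-rate arithmetic only works if the bad generations are actually caught; one simple way, sketched here as an assumption about the plumbing rather than anything from this thread, is to parse the output and regenerate on failure:

import json

def generate_validated(make_completion, max_attempts=4):
    # make_completion() is assumed to return the full JSON text
    # (the prompt's opening bracket plus the model's continuation).
    for _ in range(max_attempts):
        text = make_completion()
        try:
            parsed = json.loads(text)
            if isinstance(parsed, list):  # the schema says the conversation is a list
                return parsed
        except json.JSONDecodeError:
            pass  # reject it and pay for another generation
    raise RuntimeError("model kept producing invalid JSON")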
Regarding the pic: you need an actual schema, like JSON or XML or something.
No schema… just prompting in the right tone and describing a situation correctly for the model to complete upon. Another screenshot: my version of AI prompt enhancement, pulled from multi-shot, written by GPT-4, instructed by me, 0-shot:
Using JSON is just one more skill that can be employed when it is desired.
To avoid going way off the rails of this topic, this is one of the ways you can explore the skills of a base model.
Then contemplate fine-tuning: no huge prompt, just training on the expected continuation of the string of tokens seen before, such as the user input plus an injected “assistant:” telling the AI where to write anew.
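And once such a training file exists, kicking off the job from Python is only a couple of calls; a minimal sketch, where the file name and the choice of base model are placeholders:

from openai import OpenAI

client = OpenAI()

# Upload the JSONL training file, then start a fine-tuning job on it.
upload = client.files.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=upload.id,
    model="gpt-3.5-turbo",  # or a base completions model such as davinci-002
)
print(job.id, job.status)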