Fine-tuning with forum post data

Hello

I have an RPG website and want to use OpenAI to create a bot which can role-play with the users.
I have a forum with around 30,000 topics and a million posts.
Is it possible to fine-tune the gpt-3.5-turbo model with this data, and does it even make sense? The goal is for the model to get better at role-playing like the users.
This could be a prompt I would send (and then I would send each user post after it too):
"You are an NPC in a quest on a text based role play game about harry potter. A task consists of a series of role playing forum posts. The user role plays with you.
This task is that the user would have to find an ingredient for a potion at the greenhouses next to the castle.
Begin the first post with describing how the user arrives to empty greenhouse. After a post or two from the user, a teacher should arrive, you are the teacher. You should try to figure out what the user is doing there. The user is not allowed to steal the ingredient, but if they make a good excuse, you let them go. If they wrote that they took the ingredients the task was solved, and the last reply you write should be “–!!!TASK SOLVED!!!–” and if the user doesn’t get the ingredient you should write “–!!!TASK FAILED!!!–”. This task should ideally consist of 5-15 posts. The users name is Valentine and you, as the teacher is named Melanie. You will write the first post(and wait for reply), and i will send you the users answer so you can continue with next post. Your reply should not consist of a headline, but be a text i can post directy. 3rd person past tense. The posts should be 4-8 lines. "

Fine-tuning with a forum’s messages makes no sense for making a role-play chatbot.

Also, have you obtained consent to have users’ data sent to OpenAI to make into your own products?

You should read more in the fine-tuning documentation about the example conversation exchanges that show an AI how to behave. That will help you see that, when you fine-tune, you are showing the AI how to behave when it receives a particular type of question.
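
For reference, one training example in the chat fine-tuning format is a complete conversation exchange, one JSON object per line of the .jsonl file. A rough sketch, with placeholder text based on your scenario:

```python
import json

# One training example = one JSON object per line in the .jsonl file.
# The assistant message shows the behaviour you want for this kind of user input.
example = {
    "messages": [
        {"role": "system", "content": "You are Melanie, a teacher NPC in a Harry Potter role-play quest."},
        {"role": "user", "content": "Valentine slipped into the greenhouse and reached for the moonseed pods."},
        {"role": "assistant", "content": "The door creaked open behind her, and Melanie stepped inside with her eyebrows raised..."},
    ]
}
print(json.dumps(example))  # append one such line per example to training.jsonl
```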

OK, I thought it would help it learn how to format the role-play responses.
Yes, I have consent to use the data.
Is there another, better option for me, or should I just use the models as is?


I don’t understand how it would help…

AI narrator: I am your host and navigator on an amazing adventure. What will you do next, warrior?

User: I use my spell of revealing to find out how to fine tune an AI model.

AI narrator: Out of the sky, a voice appears: “Fine-tuning with a forum’s messages makes no sense for making a role-play chatbot.”

User: I pay the merchant 10 gold pieces for the sword of healing.

AI narrator: If you have problems with your credit card, you can look to see if it is one of their supported countries.

Well, a topic goes more like this: gist.github.com/Danielss89/9501b0732716f604b11cfecfbac56834

Imagine the AI being one of the users then.

Have you tried the exact prompt you describe? You could generalise it with placeholders for the player’s name and the NPC’s name; it seems like something that might work with some trial and error and iterative testing.

You could always split the stages up and have them controlled by traditional software, with the AI told to give a specific response that you can check for should the player complete that stage’s task.
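
A minimal sketch of that kind of stage controller, assuming the 0.x openai Python package and using the “–!!!TASK SOLVED!!!–” / “–!!!TASK FAILED!!!–” markers from your prompt as the signals to check for (the function names here are just placeholders):

```python
import openai

SOLVED = "–!!!TASK SOLVED!!!–"
FAILED = "–!!!TASK FAILED!!!–"

def npc_reply(history):
    """Ask the model for the NPC's next post, given the conversation so far."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=history,
    )
    return response["choices"][0]["message"]["content"]

def run_stage(system_prompt, get_user_post):
    """Traditional control loop: the AI writes posts, the software decides when the stage ends."""
    history = [{"role": "system", "content": system_prompt}]
    while True:
        reply = npc_reply(history)
        history.append({"role": "assistant", "content": reply})
        if SOLVED in reply:
            return "solved"
        if FAILED in reply:
            return "failed"
        # get_user_post is a placeholder: post the reply on the forum and wait for the player's answer.
        user_post = get_user_post(reply)
        history.append({"role": "user", "content": user_post})
```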

Oh yes, I have tried it. And it actually works OK.
I just thought I could make it work better if I had 700,000 examples of role play it could “draw inspiration from” :stuck_out_tongue:

Fine-tuning would teach it more about the style of the written text rather than the content, but you could certainly try it with a limited subset. It might get to understand the relationships between the puzzles and the user; you won’t really know until it’s been tried.

I think you may also get some use from embeddings: being able to look up relevant text from your large corpus may add a lot of useful context, especially if you tell the model to play a role and to use the retrieved text as inspiration.
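
A rough sketch of that kind of lookup, assuming the 0.x openai package and numpy (the corpus contents and variable names are just placeholders):

```python
import numpy as np
import openai

def embed(texts):
    """Embed a list of strings with OpenAI's embedding model."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in resp["data"]])

# Build the index once over your forum posts (placeholder corpus).
corpus = [
    "Valentine crept past the greenhouse benches, eyes on the moonseed pods...",
    "Melanie folded her arms and waited for an explanation...",
]
corpus_vectors = embed(corpus)

def most_relevant(query, k=3):
    """Return the k corpus posts most similar to the query, by cosine similarity."""
    q = embed([query])[0]
    scores = corpus_vectors @ q / (np.linalg.norm(corpus_vectors, axis=1) * np.linalg.norm(q))
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved posts can then be pasted into the system prompt as style inspiration.
```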

OK, so my question is, how should I format the data?
As I read in the docs, I should format it to have a prompt and a response. But I don’t really have the prompts when it is forum role play. Or should I make the prompt something like:
Prompt: Answer this topic reply: “TOPIC REPLY HERE”
Response: “NEXT TOPIC REPLY HERE”

Ahh, well, the standard method is to just leave the “prompt” part blank: in this case the “user” role can have the contents of “” (nothing), and then you fill in the assistant role with around 1,000 tokens’ worth of your source material. You may find that including 250 tokens’ worth of the prior chunk and 250 tokens’ worth of the next chunk along with 500 tokens’ worth of new data will allow for cross-chunk retrieval. Think of it as a sliding window of text that has a pre and post section of 100-250 tokens.
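
A rough sketch of building that kind of training file, assuming the chat fine-tuning JSONL format and using a simple word count as a stand-in for tokens (tiktoken would be more accurate; the sizes and names are just illustrative):

```python
import json

def chunks_with_overlap(words, new=500, overlap=250):
    """Sliding window: each chunk is `overlap` prior words + `new` new words + `overlap` following words."""
    out = []
    for start in range(0, len(words), new):
        pre = words[max(0, start - overlap):start]
        body = words[start:start + new]
        post = words[start + new:start + new + overlap]
        out.append(" ".join(pre + body + post))
    return out

def write_training_file(posts, path="roleplay.jsonl"):
    """Blank user prompt, assistant filled with a chunk of source material, one example per line."""
    words = " ".join(posts).split()
    with open(path, "w", encoding="utf-8") as f:
        for chunk in chunks_with_overlap(words):
            example = {"messages": [
                {"role": "user", "content": ""},
                {"role": "assistant", "content": chunk},
            ]}
            f.write(json.dumps(example, ensure_ascii=False) + "\n")
```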

So much for just doing an easy database dump :smiley: haha
OK, thanks, that gives me something to try out :slight_smile:


You could have your database first scrubbed for high-quality examples with GPT, and then use those examples to create synthetic data in the style you want to fine-tune the model on. Check out the “Textbooks Are All You Need” paper if you haven’t already.
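
A rough sketch of the scrubbing step, assuming the 0.x openai package (the rating prompt and threshold are just placeholders):

```python
import openai

def quality_score(post):
    """Ask the model to rate a forum post's quality as role-play writing, 1-10."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Rate the following role-play forum post for writing quality on a scale of 1-10. Reply with only the number."},
            {"role": "user", "content": post},
        ],
    )
    try:
        return int(resp["choices"][0]["message"]["content"].strip())
    except ValueError:
        return 0

def scrub(posts, threshold=8):
    """Keep only the high-quality posts to seed synthetic data generation."""
    return [p for p in posts if quality_score(p) >= threshold]
```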
