Fine-Tuning for Style Transfer on a Minimal Dataset

I want to fine-tune on a small dataset of just 10 full-length essays, each between 500 and 1,200 words. The goal is a model that generates completions in my style, tone, and voice while also staying undetectable by common AI detectors.

I'm not using prompt-completion pairs, just completions with a system message in each fine-tuning example (see the sketch below). Ideally, the smaller the dataset the better.

I'm having a hard time getting results that don't hallucinate. I was applying logit bias on top of the fine-tune, but that produced severe hallucination. On average I get about 8% vocabulary overlap and about 8% cosine similarity between the training set and the completions (roughly how I compute those is sketched below).

I've recently tried using NLTK to split my dataset into smaller, context-aware snippets so I can train on more examples, but the results have been worse (chunking sketch below).

Any tips or ideas for making this better and more consistent?
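For reference, each training example is completions-only: a system message plus an assistant message, no user turn. Roughly like this (the system prompt and essay text here are placeholders, not my actual ones):

```python
import json

# One chat-format fine-tuning example per JSONL line:
# a system message plus the essay as the assistant completion, no user turn.
essay_text = "First paragraph of one of my essays..."  # placeholder

example = {
    "messages": [
        {"role": "system", "content": "Write in the author's style, tone, and voice."},
        {"role": "assistant", "content": essay_text},
    ]
}
print(json.dumps(example))  # append one line like this per essay to train.jsonl
```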
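The overlap numbers come from something like this (a simplified sketch: vocab overlap as Jaccard over lowercased word sets, cosine similarity as TF-IDF over the pooled texts; exact tokenization choices will shift the numbers):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def style_similarity(train_texts, completions):
    """Vocabulary overlap (Jaccard) and TF-IDF cosine between two text sets."""
    train_vocab = set(" ".join(train_texts).lower().split())
    gen_vocab = set(" ".join(completions).lower().split())
    overlap = len(train_vocab & gen_vocab) / len(train_vocab | gen_vocab)

    tfidf = TfidfVectorizer().fit_transform(
        [" ".join(train_texts), " ".join(completions)]
    )
    cosine = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
    return overlap, cosine

overlap, cosine = style_similarity(
    ["an essay in my voice..."], ["a model completion..."]
)
print(f"vocab overlap: {overlap:.1%}, cosine similarity: {cosine:.1%}")
```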
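And the NLTK chunking looks roughly like this (a simplified sketch; the max_words and overlap values are arbitrary here):

```python
import nltk

nltk.download("punkt", quiet=True)  # newer NLTK versions may need "punkt_tab" instead

def chunk_essay(text, max_words=200, overlap_sentences=1):
    """Split an essay into sentence-aligned snippets of roughly max_words words,
    carrying a sentence of overlap so each snippet keeps some context."""
    sentences = nltk.sent_tokenize(text)
    chunks, current = [], []
    for sentence in sentences:
        current.append(sentence)
        if sum(len(s.split()) for s in current) >= max_words:
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:]  # carry context forward
    if len(current) > overlap_sentences or not chunks:
        chunks.append(" ".join(current))  # leftover sentences
    return chunks

snippets = chunk_essay("One of my essays as a single string. " * 100)
print(len(snippets), "snippets")
```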
