Fine tuned model's response is not lengthy/detailed

Dev1000 · February 23, 2024, 10:23am

I have fine-tuned the gpt-3.5-turbo-1106 model with a few prompts. The response from the model is as expected, but it is not as detailed/elaborate as the original model's (gpt-4 or 3.5). I tried changing parameters like temperature and top_p, but nothing seems to change.
For example, if I ask the fine-tuned model to write an email, it gives a short response, but the default model gives a detailed one. Is there any way to solve this?

jr.2509 · February 23, 2024, 11:02am

Hi @Dev1000 - could you please elaborate a bit on how you finetuned your model? That will help to identify possible root causes.

idonotwritecode · February 23, 2024, 11:31am

When you fine-tune a model, you are steering its responses toward the style of your training data. So the level of detail you get back depends on how you have fine-tuned it.

From my view, the fine tuned model is working as expected.

The way to fix this would be to fix how your prompt/completions are created in the first place. You should ideally add more content to your completions to give the model an indication that you want more data back. Once you have fine tuned it, usually changing temp does nothing to it.

Dev1000 · February 23, 2024, 1:09pm

So my fine-tuning file is something like this:

{"messages": [{"role": "user", "content": "All persons begin with tag <PER> and end with tag </PER>"}, {"role": "assistant", "content": "OK. I will identify person with tags <PER> and </PER>"}]}
{"messages": [{"role": "user", "content": "Extract person names enclosed within '<PER>' tag and '</PER>' For example Can you find person names? - <PER>teghwjw ugiqfhfv</PER> received a promotion at work for his outstanding performance.Yes, I can find person names within the provided text. In this case, the person's name appears to be <PER>teghwjw ugiqfhfv</PER>."}, {"role": "assistant", "content": "Yes, I can find person names within the provided text. In this case, the person's name appears to be <PER>teghwjw ugiqfhfv</PER>"}]}

where my data is masked with a random value and enclosed in XML tags depending on the entity (if the data is a person name, a random value replaces it inside the XML tag). In the file I have indicated that, e.g., asdqeqwe123 is a person name (for almost 10 different tags), and I have given a few examples for each tag.
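
To make the masking step concrete, here is a minimal sketch of what it could look like in Python. The helper names (`random_token`, `mask_entity`), the placeholder length, and the sample sentence are made up for illustration; the actual pipeline may work differently.

```python
import random
import string


def random_token(rng, length=8):
    # Hypothetical helper: a random lowercase string used as a stand-in value.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def mask_entity(text, entity, tag, rng):
    """Replace each occurrence of `entity` with a random placeholder wrapped in <tag>...</tag>.

    One random token is generated per word of the entity, so a two-word
    name becomes a two-word placeholder.
    """
    placeholder = " ".join(random_token(rng) for _ in entity.split())
    masked = text.replace(entity, f"<{tag}>{placeholder}</{tag}>")
    return masked, placeholder


rng = random.Random(0)  # seeded only so the sketch is repeatable
masked, placeholder = mask_entity(
    "John Smith received a promotion at work.", "John Smith", "PER", rng
)
print(masked)
```

The original name no longer appears in `masked`; only the tagged placeholder does.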

jr.2509 · February 23, 2024, 2:00pm

Focusing purely on the issue with the length that you have pointed out, this is most likely due to the nature of the training data you have used for fine-tuning. Looking at the examples, they are relatively short in nature. Additionally, there is nothing in your prompts indicating that the assistant is supposed to provide a more detailed response.

Hence, if you’d want a more detailed response, then it is best to (a) include examples in your fine-tuning data set that are reflective of your desired target length and (b) to expand your existing prompt by describing in greater detail the style of output you are looking for.
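
As a rough way to act on both points, the sketch below counts the words in each assistant reply of a JSONL training file and shows how a system message describing the desired style could be prepended to a record. The function names and the instruction text are assumptions for illustration, and the sample line is taken from the file shown earlier in the thread.

```python
import json


def assistant_word_counts(jsonl_text):
    """Return the number of words in the assistant replies of each training record."""
    counts = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        words = sum(
            len(m["content"].split())
            for m in record["messages"]
            if m["role"] == "assistant"
        )
        counts.append(words)
    return counts


def add_system_prompt(record, instruction):
    """Return a copy of the record with a single system message placed first."""
    others = [m for m in record["messages"] if m["role"] != "system"]
    return {"messages": [{"role": "system", "content": instruction}] + others}


sample = (
    '{"messages": [{"role": "user", "content": "All persons begin with tag <PER> '
    'and end with tag </PER>"}, {"role": "assistant", "content": "OK. I will '
    'identify person with tags <PER> and </PER>"}]}'
)

counts = assistant_word_counts(sample)
print(counts)

# Hypothetical instruction text; adjust it to the style you actually want.
expanded = add_system_prompt(
    json.loads(sample),
    "Respond in detail, with full sentences and complete explanations.",
)
print(expanded["messages"][0]["role"])
```

If most counts come out far below the length you expect at inference time, that points to the training examples themselves as the cause of the short outputs.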
