Fine-tuned model's responses are not lengthy/detailed

I have fine-tuned the GPT-3.5 Turbo 1106 model with a few prompts. The responses from the model are as expected, but they are not as detailed/elaborate as those from the original model (gpt-4 or 3.5). I tried changing parameters like temperature and top_p, but nothing seems to change.
For example, if I ask the fine-tuned model to write an email, it gives a short response, whereas the default model gives a detailed one. Is there any way to solve this?

Hi @Dev1000 - could you please elaborate a bit on how you finetuned your model? That will help to identify possible root causes.

When you fine-tune a model, you are making its responses more specific to your training data. So the level of detail you get back depends on how you have fine-tuned it.

In my view, the fine-tuned model is working as expected.

The way to fix this would be to change how your prompts/completions are created in the first place. Ideally, you should add more content to your completions to give the model an indication that you want more data back. Once you have fine-tuned it, changing the temperature usually does nothing to change this.

So my fine-tuning file is something like this:

{"messages": [{"role": "user", "content": "All persons begin with tag <PER> and end with tag </PER>"}, {"role": "assistant", "content": "OK. I will identify person with tags <PER> and </PER>"}]}
{"messages": [{"role": "user", "content": "Extract person names enclosed within '<PER>' tag and '</PER>' For example Can you find person names? - <PER>teghwjw ugiqfhfv</PER> received a promotion at work for his outstanding performance.Yes, I can find person names within the provided text. In this case, the person's name appears to be <PER>teghwjw ugiqfhfv</PER>."}, {"role": "assistant", "content": "Yes, I can find person names within the provided text. In this case, the person's name appears to be <PER>teghwjw ugiqfhfv</PER>"}]}

where my data is masked with a random value and enclosed in XML tags depending on the entity (for example, if the data is a person name, a random value replaces it inside the XML tag). In the file I have indicated that, e.g., asdqeqwe123 is a person name (for almost 10 different tags), and I have given a few examples for each tag.

Focusing purely on the issue with the length that you have pointed out, this is most likely due to the nature of the training data you have used for fine-tuning. Looking at the examples, they are relatively short in nature. Additionally, there is nothing in your prompts indicating that the assistant is supposed to provide a more detailed response.
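One quick way to check this is to measure how long the assistant replies in the training file actually are. The snippet below is only a rough sketch: the function name `assistant_word_counts` is made up for illustration, and it assumes the file uses the chat-style JSONL layout shown above (one JSON object per line with a `messages` list). The sample line is the first record from the post.

```python
import json


def assistant_word_counts(jsonl_text):
    """Return the word count of every assistant message in chat-format JSONL text."""
    counts = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        for message in record["messages"]:
            if message["role"] == "assistant":
                # Whitespace-separated words: a crude proxy for response length.
                counts.append(len(message["content"].split()))
    return counts


# First training record from the post above.
sample = '{"messages": [{"role": "user", "content": "All persons begin with tag <PER> and end with tag </PER>"}, {"role": "assistant", "content": "OK. I will identify person with tags <PER> and </PER>"}]}'

counts = assistant_word_counts(sample)
print(counts)  # [10]
```

If the counts across the whole file are all in this range, the fine-tuned model has mostly seen very short replies, which would be consistent with the short outputs described.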

Hence, if you want a more detailed response, it is best to (a) include examples in your fine-tuning data set that reflect your desired target length and (b) expand your existing prompt by describing in greater detail the style of output you are looking for.
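As a rough illustration of both points, a training record could carry a system message that describes the desired style plus an assistant reply written at the target length. This is a minimal sketch, not a recommended recipe: the helper `make_example`, the system prompt wording, and the example email are all hypothetical, and the reply is kept short here only for readability. In real training data it would be as long as the responses you want back.

```python
import json

# Hypothetical system prompt; the wording is an assumption, not from the thread.
SYSTEM_PROMPT = (
    "Text may contain masked entities wrapped in tags such as <PER> and </PER>. "
    "Keep the tags intact and write detailed, multi-paragraph responses."
)


def make_example(user_prompt, assistant_reply, system_prompt=SYSTEM_PROMPT):
    """Build one chat-format training record and return it as a single JSONL line."""
    record = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
            {"role": "assistant", "content": assistant_reply},
        ]
    }
    return json.dumps(record)


# Illustrative content only: a masked name reused from the post, with an invented email.
line = make_example(
    "Write an email congratulating <PER>teghwjw ugiqfhfv</PER> on the promotion.",
    "Subject: Congratulations on your promotion\n\n"
    "Dear <PER>teghwjw ugiqfhfv</PER>,\n\n"
    "I was delighted to hear about your promotion. It is a well-earned "
    "recognition of your outstanding performance over the past year.\n\n"
    "Best regards",
)
print(line)
```

The same system prompt would then also be sent at inference time, so the model sees the instruction in the same place it saw it during training.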
