'Side task' fine-tuning and target task benefit?

Suppose my target task is to use an LLM to generate certain kinds of emails. Each email would have between 10 and 16 ‘tags’, which you can think of as a multi-label problem. So the ultimate scenario is that I provide a few sentences describing my needs, then present these ‘tags’, and the LLM produces the email.

We already have a lot of such emails (in the thousands), together with their tags.

I have experimented with simple prompting and few-shot learning, but the results are not great. In most cases, it appears that the LLM does not understand the meaning of some of the tags.

I am thinking that, since I already have thousands of emails with such ‘tags’, I can fine-tune the LLM on a multi-label classification task, using the emails as inputs and their ‘tags’ as labels, thus producing my own customized model. Let’s call this a ‘side task’.
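Concretely, the side task I have in mind would look something like the rough sketch below (not tested; the base model name, tag vocabulary, and data loading are just placeholders, and it trains a classification head rather than the generator itself):

```python
# Rough sketch of the 'side task': multi-label classification over the tag set.
# Base model name, tag vocabulary, and data loading are placeholders.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

ALL_TAGS = ["urgent", "billing", "apology", "follow-up"]  # full tag vocabulary here
tag2id = {t: i for i, t in enumerate(ALL_TAGS)}

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(ALL_TAGS),
    problem_type="multi_label_classification",  # BCE loss, one logit per tag
)

def encode(body, tags):
    enc = tokenizer(body, truncation=True, max_length=512, padding="max_length")
    labels = [0.0] * len(ALL_TAGS)
    for t in tags:
        labels[tag2id[t]] = 1.0
    enc["labels"] = labels  # multi-hot float vector
    return enc

# train_dataset = [encode(e["body"], e["tags"]) for e in my_emails]
# Trainer(model=model,
#         args=TrainingArguments(output_dir="side-task", num_train_epochs=3),
#         train_dataset=train_dataset).train()
```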

So my question is: if I fine-tune the LLM on this side task and then use the fine-tuned model on my target task, will it do better? The idea is that, through the side task, the LLM ‘learns’ the meanings of the tags and how they ‘map’ to words.

Thanks

The only way to be sure is to try it. To get a realistic idea of performance, you will typically need a few thousand examples.


I’d recommend splitting the training data into two halves:

  1. email → categories (so the model learns what each tag means)
  2. subject plus categories → email (your target generation direction)
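
Something like this rough sketch, assuming instruction-style prompt/completion pairs (field names and prompt wording are just placeholders):

```python
# Rough sketch: format half the data as classification examples and half as
# generation examples for a single instruction-style fine-tune.
# Field names are placeholders for whatever your email records contain.
import json
import random

def classification_example(email):
    # Direction 1: email body in, category/tag list out.
    return {
        "prompt": f"Assign categories to this email:\n{email['body']}\nCategories:",
        "completion": " " + ", ".join(email["tags"]),
    }

def generation_example(email):
    # Direction 2: subject plus categories in, email body out.
    return {
        "prompt": (f"Write an email.\nSubject: {email['subject']}\n"
                   f"Categories: {', '.join(email['tags'])}\nEmail:"),
        "completion": " " + email["body"],
    }

def build_mixed_dataset(emails, path="mixed_finetune.jsonl"):
    random.shuffle(emails)
    half = len(emails) // 2
    records = ([classification_example(e) for e in emails[:half]] +
               [generation_example(e) for e in emails[half:]])
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
```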