Does anyone have any more thoughts on this, please?
For example:
- Should the training prompt be the interview transcript alone?
- Or should it be preceded by the kind of writing instructions I would issue?
e.g. “The following text is a transcript of an interview. Write a 650-word article focused on communicating the views and insights of the interviewee. 66% of the article should be direct quotes. In all other material, remain objective: do not positively endorse the interviewee’s viewpoints, and do not include a ‘conclusion’. Add a smart headline and use Markdown ## for sub-headings…”
- Is fine-tuning subject to max-token limits? An interview transcript used as a prompt is a lot longer than the small snippets I see most people using for training, and the resulting article, at 650+ words, is much longer than the output of a standard custom entity classifier, too.
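
To make the question concrete, here is a rough sketch of how I imagine building one training example either way (transcript alone, or instructions plus transcript) and checking its length. Everything in it is an assumption on my part: the `prompt`/`completion` field names, the abbreviated `INSTRUCTIONS` string, the helper names, and especially the token estimate, which is a crude words-to-tokens heuristic rather than a real tokenizer. The budget of 2048 is just a placeholder, not a documented limit.

```python
import json

# Hypothetical, abbreviated version of the writing instructions above.
INSTRUCTIONS = (
    "The following text is a transcript of an interview. "
    "Write a 650-word article communicating the views of the interviewee."
)


def build_record(transcript: str, article: str, with_instructions: bool) -> dict:
    """Build one training example; the field names are an assumption."""
    if with_instructions:
        prompt = f"{INSTRUCTIONS}\n\n{transcript}"
    else:
        prompt = transcript
    return {"prompt": prompt, "completion": article}


def rough_token_count(text: str) -> int:
    """Crude estimate: about 4 tokens per 3 words. Not a real tokenizer."""
    return len(text.split()) * 4 // 3


def fits_budget(record: dict, max_tokens: int) -> bool:
    """Check whether prompt plus completion stays within a token budget."""
    total = rough_token_count(record["prompt"]) + rough_token_count(record["completion"])
    return total <= max_tokens


def to_jsonl_line(record: dict) -> str:
    """Serialize one example as a single JSON Lines row."""
    return json.dumps(record, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder texts standing in for a real transcript and article.
    transcript = "word " * 3000
    article = "word " * 650
    record = build_record(transcript, article, with_instructions=True)
    print("fits placeholder budget:", fits_budget(record, max_tokens=2048))
    print(to_jsonl_line(record)[:80])
```

If a check like this fails for most of my transcripts, I assume I would need to shorten or split them before training, which is really what the last question is getting at.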
Keen to learn more about how to take this forward.