The idea of a `stop_sequence` in the training data for fine-tuning is to use the same token or sequence of tokens at the end of every response, so the model “learns” to:
- Include it in every response
- Stop talking after it does
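For example, here is a minimal sketch of preparing training data this way. The JSONL field names, file name, and choice of stop sequence are all assumptions for illustration; adjust them to whatever your fine-tuning pipeline expects.

```python
import json

STOP_SEQUENCE = "\v"  # hypothetical choice; alternatives are discussed below

# Toy training examples (placeholder data)
examples = [
    {"prompt": "What is 2 + 2?", "completion": "4"},
    {"prompt": "Name a primary color.", "completion": "Red"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        # Append the same stop sequence to every completion so the
        # model learns to emit it, and to stop generating after it.
        ex["completion"] = ex["completion"] + STOP_SEQUENCE
        f.write(json.dumps(ex) + "\n")
```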
I wouldn’t recommend using the internal end-of-message token for this; instead, use a token or sequence of tokens which is incredibly unlikely to be naturally produced by the fine-tuned model for your use case.
Some interesting choices might be:
- `\a`: bell
- `\b`: backspace
- `\f`: form feed
- `\v`: vertical tab
- Etc.
I’m not entirely certain how each of these would work in practice[1]; it seems most people simply use some combination of some number of `#` and `\n`.
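At inference time, the same sequence would then be passed as the stop parameter so generation is cut off at (and excluding) it. A hedged sketch, assuming an OpenAI-style completions API (the model name is a placeholder; most completion-style APIs accept a similar `stop` argument):

```python
from openai import OpenAI

client = OpenAI()
response = client.completions.create(
    model="your-fine-tuned-model",  # placeholder model name
    prompt="What is 2 + 2?",
    stop=["\v"],  # the same stop sequence appended during training
)
# The returned text ends where the stop sequence would have appeared.
print(response.choices[0].text)
```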
This is on my list of things to experiment with ↩︎