Can preference tuning be used to encourage longer answers (say, 10+ pages) and better use of the output context?