Fine-Tuning to Avoid Scary Responses (Negative Reward)

N2U, you’re actually a genius. This can certainly be rephrased as a classification problem. Thanks a ton!
