I intend to fine-tune a model so that it is aware of its previous mistakes. For example:
#### Example 1
User: what's 2x2?
AI: 4.
User: Are you sure?
AI: Absolutely.
#### Example 2
User: what's 2x2?
AI: 5.
User: Are you sure?
AI: Sorry, I meant 4.
I’d like to know whether the AI's first message is treated as frozen from the optimization point of view, i.e., excluded from the loss. Otherwise, with this approach, I would be teaching the model to say 5 in Example 2 when it knows the answer is 4.
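To be concrete about what I mean by "frozen": I'm thinking of per-token label masking, where masked tokens contribute nothing to the loss. Here is a minimal sketch (a hypothetical helper, not any specific library's API), assuming the common convention that a label of -100 excludes a token from the loss:

```python
IGNORE_INDEX = -100  # conventional "ignore this token in the loss" label


def build_labels(turns, train_only_last_assistant=True):
    """Build per-token labels for a conversation given as (role, token_ids)
    pairs. Tokens we don't want to learn from are set to IGNORE_INDEX.

    With train_only_last_assistant=True, earlier assistant turns (like the
    mistaken "5" in Example 2) are frozen; only the final assistant turn
    (the correction) is trained on.
    """
    labels = []
    last_assistant = max(
        (i for i, (role, _) in enumerate(turns) if role == "assistant"),
        default=None,
    )
    for i, (role, tokens) in enumerate(turns):
        trainable = role == "assistant" and (
            not train_only_last_assistant or i == last_assistant
        )
        labels.extend(tokens if trainable else [IGNORE_INDEX] * len(tokens))
    return labels
```

Under this scheme, the first AI turn in Example 2 would be masked out, so the model is never directly trained to emit the wrong answer. My question is whether the fine-tuning setup actually does this, or whether all assistant turns are trained on.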