That is my assumption.
It allows the model to focus only on the precise task at hand when doing each rephrasing.
When the model processes messages, it attends to the relationships between every pair of tokens. That's why the computational complexity of attention scales with the square of the context length.
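A minimal sketch of single-head self-attention (no learned weight matrices, just raw dot products) makes the quadratic cost visible: the score matrix every token builds against every other token is n × n.

```python
import numpy as np

def self_attention(x):
    """Toy single-head self-attention over token embeddings x of shape (n, d).
    Simplified: no learned Q/K/V projections, just to show the n x n cost."""
    n, d = x.shape
    # Every token is compared with every other token: an n x n score matrix,
    # so both work and memory here grow with n squared.
    scores = x @ x.T / np.sqrt(d)
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

x = np.random.randn(8, 4)   # 8 tokens, 4-dim embeddings
out = self_attention(x)
print(out.shape)            # output keeps shape (8, 4), but the
                            # intermediate score matrix was 8 x 8
```

Double the context length and the score matrix quadruples, which is where the square-of-context-length scaling comes from.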
So even though models are very good at focusing their attention on the relevant details of the context, all the irrelevant details still exert subtle influences on the generation.
So my expectation is that you'll get the best-quality results most consistently by processing the items individually.
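In code, the per-item approach is just a loop that sends each item in its own request, so no request carries the other items as distracting context. Here `rephrase` is a hypothetical stand-in for whatever model call you're using, not a real API:

```python
def rephrase(text: str) -> str:
    """Hypothetical placeholder for a single model call that
    rephrases one piece of text. Swap in your real client here."""
    return text.strip().capitalize()  # trivial stand-in transformation

items = [
    "first draft sentence.",
    "second draft sentence.",
]

# One call per item: each request contains only the text it needs,
# so the model's attention isn't diluted by the other items.
results = [rephrase(item) for item in items]
```

Compared with batching everything into one prompt, this trades a few extra requests for a cleaner context per task.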