Yep, that’s just context attention quality: the ability of the masked attention to seem to read through the document while keeping the instructions exposed. The model has to predict a fruit word when one comes next, and also predict when there are no fruit words remaining and thus a " comes next.
If you’re willing to pay a lot and still chunk to a typical output size, here’s a completely different technique:
“Repeat this back to me without any changes to the prose. When you encounter a fruit in your response, add @@@ after the word or phrase describing a fruit. When you encounter a vegetable in your response, add !!! after the word or phrase describing a vegetable.”
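For what it's worth, here is a rough Python sketch of how the client side of that could look: chunk the document, prepend the instruction, then pull the marked words back out of the echoed text. Everything named here (`chunk_text`, `build_prompt`, `parse_tagged`, the `max_chars` budget) is made up for illustration, the actual model call is left out, and the parser only grabs the single word right before a marker, so multi-word phrases would need more work.

```python
import re

# The instruction from the post, sent ahead of every chunk.
INSTRUCTION = (
    "Repeat this back to me without any changes to the prose. "
    "When you encounter a fruit in your response, add @@@ after the word or "
    "phrase describing a fruit. When you encounter a vegetable in your "
    "response, add !!! after the word or phrase describing a vegetable."
)

def chunk_text(text, max_chars=2000):
    """Group paragraphs into chunks of roughly max_chars (hypothetical budget)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def build_prompt(chunk):
    """Instruction first, then the text the model should echo back."""
    return f"{INSTRUCTION}\n\n{chunk}"

# A word immediately followed by one of the two markers.
# Simplification: only the last word of a phrase is captured.
MARKER = re.compile(r"([A-Za-z][A-Za-z'-]*)\s*(@@@|!!!)")

def parse_tagged(response):
    """Return (fruits, vegetables, text with markers stripped)."""
    fruits, vegetables = [], []
    for word, marker in MARKER.findall(response):
        (fruits if marker == "@@@" else vegetables).append(word)
    clean = re.sub(r"\s*(?:@@@|!!!)", "", response)
    return fruits, vegetables, clean
```

Comparing the stripped text against the chunk you sent is a cheap check that the model really did repeat the prose unchanged; if they differ, that chunk is worth retrying.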