Positional encoding and changing the order of input tokens

I have a simple question, out of scientific curiosity, about positional encoding:

Is it true that, without positional encoding, one can permute the tokens in the input sequence and get the same answer from the attention head?

Or is the position of each input token still present in the attention head's output matrix, simply via the row (or column) index of that matrix?
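For what it's worth, this is easy to check numerically: without positional encoding, self-attention is permutation-*equivariant* — permuting the input rows permutes the output rows in exactly the same way, so each token's output vector is unchanged, and only its row index moves with it. Here is a minimal numpy sketch (all names and dimensions are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 4  # number of tokens, embedding dimension

# Random token embeddings with NO positional encoding added.
X = rng.standard_normal((n, d))

# Random projection matrices for a single attention head.
Wq = rng.standard_normal((d, d))
Wk = rng.standard_normal((d, d))
Wv = rng.standard_normal((d, d))

def attention(X):
    """Plain scaled dot-product self-attention (one head, no masking)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    # Numerically stable row-wise softmax.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

perm = rng.permutation(n)          # reorder the tokens
out = attention(X)
out_perm = attention(X[perm])

# Equivariance: the output rows are permuted exactly like the input rows,
# i.e. each token's output vector is identical, only its row index moved.
assert np.allclose(out_perm, out[perm])
```

So the attention values per token are identical under reordering; the row number of the output matrix only tells you where the token sits *now*, not where it originally was — which is why a positional encoding is needed if order is to influence the computed values.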