A single attention head's output

I am trying to understand the attention mechanism and have some basic questions.

  1. Is the output of one single attention head a vector or a scalar? (or even a matrix?)

  2. There is a softmax function over a matrix in the attention head. Is the output of this softmax function a vector or a scalar? If it is a vector, over which dimension is the softmax applied?
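To make the questions concrete, here is a minimal sketch of single-head scaled dot-product attention as I currently understand it (my own NumPy code; the dimension names `seq_len`, `d_k`, `d_v` are my assumptions, and the softmax placement is exactly what I am unsure about):

```python
import numpy as np

# Minimal single-head scaled dot-product attention sketch.
# Assumed shapes: seq_len tokens, key dim d_k, value dim d_v.
seq_len, d_k, d_v = 4, 8, 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(seq_len, d_k))   # queries
K = rng.normal(size=(seq_len, d_k))   # keys
V = rng.normal(size=(seq_len, d_v))   # values

scores = Q @ K.T / np.sqrt(d_k)       # (seq_len, seq_len) score matrix

# Softmax applied row-wise, i.e. over the last axis (the keys) --
# is this the right dimension?
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

output = weights @ V                  # (seq_len, d_v)

print(weights.shape, output.shape)    # prints: (4, 4) (4, 8)
```

In this sketch, both the softmax output and the head output come out as matrices, not vectors or scalars, which is part of my confusion.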