Maximum token limit per request: does it also apply to the Assistant API?

We are using the Completions API with streaming and occasionally hit the maximum limit of 4096 tokens. We are currently at Usage Tier 3. To make sure I understand the limit, let me ask a few questions:

  1. The TPM (tokens per minute) limit set on our account applies to all API calls made within a minute, right?
  2. So if we issue only one request in a minute, but the combined token count of the prompt and the response exceeds 4096, we will get “finish_reason=length”, which means we hit the limit. Is that correct?
  3. Is the limit the same for the Assistant API? If so, and the response is long enough to exceed 4096 tokens, does that mean a single run can produce more than one message, and that these messages together make up the response to our single prompt?
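
To make question 2 concrete, here is a minimal sketch of the kind of check I mean. It assumes each streamed chunk is a dict shaped like a chat-style streaming chunk (`choices[].delta.content` and `choices[].finish_reason`); the helper names `collect_stream` and `was_truncated` and the sample chunks are made up for illustration and are not real API output.

```python
from typing import Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Join the streamed text pieces and return (text, last finish_reason seen).

    Assumption: chunks look like {"choices": [{"delta": {...}, "finish_reason": ...}]}.
    """
    parts = []
    finish_reason = None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            content = delta.get("content")
            if content:
                parts.append(content)
            # finish_reason stays None until the final chunk of a choice.
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


def was_truncated(finish_reason: Optional[str]) -> bool:
    """True when generation stopped because a token limit was reached."""
    return finish_reason == "length"


# Hypothetical chunks standing in for a real streamed response.
sample_chunks = [
    {"choices": [{"delta": {"content": "Hello"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": ", world"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "length"}]},
]

text, reason = collect_stream(sample_chunks)
if was_truncated(reason):
    print("Response was cut off by the token limit:", repr(text))
```

With a real stream, the same loop would run over the chunks returned by the client library (converted to dicts) instead of `sample_chunks`.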