It’s very common for the output to stay significantly below the 4096 output token limit. Roughly 800–900 words tends to be the upper end of what the model returns in a single API call. None of your proposed actions will change that.
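To make the point concrete, here's a minimal sketch (assuming the official Python SDK; the model name and prompt are just placeholders): `max_tokens` is a ceiling, not a target, so raising it won't make the model write more.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# max_tokens only caps the output; the model stops at its own natural
# end point, which in practice is often around 800-900 words.
response = client.chat.completions.create(
    model="gpt-4",  # illustrative; substitute whichever model you call
    messages=[
        {"role": "user", "content": "Write a detailed 3000-word report."}
    ],
    max_tokens=4096,  # raising this does not lengthen the response
)

print(response.usage.completion_tokens)  # typically far below 4096
```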
@_j recently made a few good posts about this issue, but I'm struggling to find them right now.
Edit: Here is one of the posts that speaks to that: