I have verified that when streaming with gpt-3.5-turbo and n > 1, all of the completions appear to be cut off at the length of the shortest completion (the exact same number of tokens for each).
Every index reports a finish_reason of stop, as expected, but many of the completions end mid-sentence.
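In case it helps reproduce, here is a rough sketch of how I'm accumulating the stream. The collect_stream helper and the prompt are just illustrative, not my exact code; the commented-out section shows the shape of the request I'm making:

```python
from collections import defaultdict

def collect_stream(chunks):
    """Accumulate streamed chat-completion chunks into full texts per choice index.

    Each chunk is a dict shaped like the raw streaming payload:
    {"choices": [{"index": 0, "delta": {"content": "..."}, "finish_reason": None}]}
    """
    texts = defaultdict(str)
    finish = {}
    for chunk in chunks:
        for choice in chunk["choices"]:
            i = choice["index"]
            # delta may omit "content" (e.g. on the final chunk for an index)
            texts[i] += choice["delta"].get("content") or ""
            if choice.get("finish_reason") is not None:
                finish[i] = choice["finish_reason"]
    return dict(texts), finish

# Hypothetical usage against the real API (requires the openai package and a key):
# import openai
# stream = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": "Write a limerick."}],
#     n=3,
#     stream=True,
# )
# texts, finish = collect_stream(chunk.to_dict_recursive() for chunk in stream)
# for i in sorted(texts):
#     print(i, finish.get(i), repr(texts[i]))
```

With this, every index ends up with finish_reason "stop" but the accumulated texts are all the same token length, which is what made me suspect the truncation is server-side rather than in my accumulation.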
Has anyone else seen this?