I have verified that when streaming with gpt-3.5-turbo and n > 1, all of the completions appear to be cut off at the length of the shortest completion (the exact same number of tokens for each).
Every index reports a finish_reason of stop, as expected, but many of the completions end mid-sentence.
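In case it helps reproduce, here is a rough sketch of how I'm accumulating the stream. The collect_stream helper and the prompt are just illustrative, not my exact code; the commented-out section shows the shape of the request I'm making:

```python
from collections import defaultdict

def collect_stream(chunks):
    """Accumulate streamed chat-completion chunks into full texts per choice index.

    Each chunk is a dict shaped like the raw streaming payload:
    {"choices": [{"index": 0, "delta": {"content": "..."}, "finish_reason": None}]}
    """
    texts = defaultdict(str)
    finish = {}
    for chunk in chunks:
        for choice in chunk["choices"]:
            i = choice["index"]
            # delta may omit "content" (e.g. on the final chunk for an index)
            texts[i] += choice["delta"].get("content") or ""
            if choice.get("finish_reason") is not None:
                finish[i] = choice["finish_reason"]
    return dict(texts), finish

# Hypothetical usage against the real API (requires the openai package and a key):
# import openai
# stream = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": "Write a limerick."}],
#     n=3,
#     stream=True,
# )
# texts, finish = collect_stream(chunk.to_dict_recursive() for chunk in stream)
# for i in sorted(texts):
#     print(i, finish.get(i), repr(texts[i]))
```

With this, every index ends up with finish_reason "stop" but the accumulated texts are all the same token length, which is what made me suspect the truncation is server-side rather than in my accumulation.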
Has anyone else seen this?