Hello, I'm planning to switch my requests from non-streaming to streaming, similar to the approach used by the ChatGPT web app.
How does the speed of a non-streaming request compare with a streaming one? Is there any significant difference?
I did a small test and the difference seems minor, but I'd be glad if anyone has more extensive tests or benchmarks and could share their results here.
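
To make it clearer what I mean by "speed", here is a minimal sketch of how the two modes could be timed. It is not tied to any real API: `fake_stream`, `time_stream`, and `time_blocking` are hypothetical names, and the stream is simulated with `time.sleep`, so the numbers it prints say nothing about a real service. The idea is just to record the time to the first chunk separately from the total time, since those two could differ between modes.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List


def fake_stream(chunks: List[str], delay: float) -> Iterator[str]:
    """Hypothetical stand-in for a streaming response: yields one chunk per delay."""
    for chunk in chunks:
        time.sleep(delay)  # simulated generation/network latency (assumption)
        yield chunk


def time_stream(stream: Iterable[str]) -> Dict[str, object]:
    """Consume a stream, recording time to first chunk and total time."""
    start = time.perf_counter()
    first_chunk_s = None
    parts = []
    for chunk in stream:
        if first_chunk_s is None:
            first_chunk_s = time.perf_counter() - start
        parts.append(chunk)
    total_s = time.perf_counter() - start
    return {"text": "".join(parts), "first_chunk_s": first_chunk_s, "total_s": total_s}


def time_blocking(call: Callable[[], str]) -> Dict[str, object]:
    """Time a call that returns the whole response at once."""
    start = time.perf_counter()
    text = call()
    total_s = time.perf_counter() - start
    # Nothing is visible until the full response arrives,
    # so the first output and the total coincide.
    return {"text": text, "first_chunk_s": total_s, "total_s": total_s}


if __name__ == "__main__":
    chunks = ["Hello", ", ", "world", "!"]
    streamed = time_stream(fake_stream(chunks, 0.05))
    blocking = time_blocking(lambda: "".join(fake_stream(chunks, 0.05)))
    for name, result in (("stream", streamed), ("non-stream", blocking)):
        print(name, f"first={result['first_chunk_s']:.3f}s", f"total={result['total_s']:.3f}s")
```

With a real client, `fake_stream(...)` would be replaced by whatever iterator the streaming call returns, and the lambda by the ordinary non-streaming call.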