GPT-4 Streaming Output Radically Different than Static Output


I have been messing around with streaming with GPT4. Some of my testing has been incredible, but sometimes the outputs are garbage. They are radically different from gpt4 (stream=False) and chatgpt outputs. They are often completely unusable. Is it possible that certain prompts, when stream=True, just don’t play well? Is the architecture actually different in the background?


Do you have a consistent example prompt? Can’t say I’ve noticed much of a quality difference when switching between stream=True/False. ChatGPT is a bit different, since there’s system message/temperature/history changes that can result in different responses.

Also, what are you using to interact with the API (what language/package)? Perhaps there’s an issue with the code/library in how it is re-assembling the streamed responses.

1 Like

We just tried streaming too on GPT-4 and got garbage output for the same prompt. Can someone from OpenAI comment on this?


Can you please give some examples of this, and a code snippet of your API calls?