Responses at chat.openai.com are streamed over the network word by word. It seems obvious that the response is already complete somewhere on the server side, so the slow printing just wastes time.
Is there a way to switch to a normal “chatting” mode with instant messages?
I’d guess it would improve performance as well.
My understanding is that the response is not already complete on the server side; the nature of the algorithm is such that it generates those response tokens in real time, and the stream shows them to you as they are produced.
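To illustrate the idea, here is a minimal toy sketch (my own hypothetical example, not OpenAI's actual code) of why streaming makes sense for an autoregressive model: each token only exists once the previous ones have been generated, so the client can either render tokens as they arrive or wait just as long and get them all at once.

```python
import time

def generate_tokens(prompt):
    """Toy stand-in for an autoregressive language model: each token is
    produced only after the previous one, so the full reply never exists
    on the server before streaming begins. (Hypothetical example.)"""
    reply = ["Tokens", "are", "produced", "one", "at", "a", "time."]
    for token in reply:
        time.sleep(0.01)  # stands in for per-token model inference latency
        yield token

# The client can render each token as soon as it arrives, instead of
# waiting for the whole reply; total latency is the same either way.
received = []
for token in generate_tokens("why stream?"):
    received.append(token)
print(" ".join(received))
```

Turning streaming off wouldn't make the reply appear instantly; it would only hide the tokens until the last one is done.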
See this Stack Exchange question for more: “Are there any UX reasons for ChatGPT staggering replies word by word?” on User Experience Stack Exchange.