When using my plugin, ChatGPT usually needs to generate and send a significant number of tokens with each request. Currently, my service can only start working on a request once all of the tokens have been generated and the request has been sent.
As a result, my plugin feels much slower than strictly necessary. The user experience would be dramatically improved if the request were initiated immediately and the request body streamed in as each token is generated, similar to using stream: true in the completions API.
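To make the idea concrete, here is a minimal sketch (all names are hypothetical, not part of any real API): token_source plays the role of ChatGPT emitting tokens one by one, and process stands in for whatever per-token work the plugin service actually does. An event log shows why streaming helps: the buffered handler cannot touch a single token until generation finishes, while the streamed handler overlaps its work with generation.

```python
from typing import Iterable, Iterator, List

def token_source(tokens: List[str], log: List[str]) -> Iterator[str]:
    # Hypothetical stand-in for ChatGPT generating tokens one at a time.
    for tok in tokens:
        log.append(f"generated {tok}")
        yield tok

def process(tok: str, log: List[str]) -> str:
    # Hypothetical stand-in for the plugin service's per-token work.
    log.append(f"processed {tok}")
    return tok.upper()

def handle_buffered(tokens: Iterable[str], log: List[str]) -> List[str]:
    # Today: the service only starts once the whole body has arrived.
    body = list(tokens)                      # drains the source first
    return [process(t, log) for t in body]

def handle_streamed(tokens: Iterable[str], log: List[str]) -> List[str]:
    # Proposed: work on each token as soon as it is generated.
    return [process(t, log) for t in tokens]

log_a: List[str] = []
handle_buffered(token_source(["a", "b"], log_a), log_a)
# log_a: generated a, generated b, processed a, processed b

log_b: List[str] = []
handle_streamed(token_source(["a", "b"], log_b), log_b)
# log_b: generated a, processed a, generated b, processed b
```

Both handlers produce the same result, but in the streamed variant the service's work on early tokens happens while later tokens are still being generated, so end-to-end latency shrinks by roughly the generation time of the body.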
Is anyone from OpenAI able to say whether this is likely to be supported in the future?
Respectfully, I’m puzzled by the tone of your reply. I suggest updating the first post of the “read this before posting” thread. I did see the first post there, but the part you mentioned is buried quite far down the page. I certainly do not expect any replies within a particular timeframe, and I’m sorry if my post implied that. I mainly posted this here to see if anyone else is interested in the same feature.
I would not put it that way.
Over time I just learned where the OpenAI staff make information public, and I check those places daily.
If you would like to know those places, please ask as a new topic and I will reply.
I am never given any inside information; I first see it when the public sees it.
However, I am given some perks for being a frequent contributor to this site, such as moderator status and access to the ChatGPT plugin developer options. I suspect part of the reason I was granted moderator status is that I have been a Discourse admin (a role with even more rights) on another site for a few years, and as a daily visitor of the Discourse Meta site I keep up to date with Discourse and am known to the Discourse staff, some of whom are on this site.
Sorry if it seemed a bit harsh, but over the years I have learned that if I give an inch with new users, some will troll you into a debate. Being firm and direct upfront sets the bar higher, and it is easier to lower the bar later than to raise it.