Is there any way, while streaming, to find the percentage of the response received so far out of the total response that GPT-4 will give?
For example, if I prompt "Hi there!", let's assume GPT-4 is likely to respond "Hi there! I'm an AI assistant developed by OpenAI. How can I help you today?"
While streaming this, I would like to know what percentage of the response has been received so far, considering the entire response as 100%.
Is this feature already supported? If yes, can anyone give me a reference? If not, is there any guidance on how to implement this?
Suppose I ask you to describe your last breakfast in detail. When you start speaking, do you know ahead of time how many words you will use, or how long that description will take?
In other words, how is the model (or anyone, apart from the prompter) supposed to know how long the output is going to be?
I see! I get your point here.
The reason I'm asking is that while streaming, the response sometimes stalls for a long time, which makes for a bad user experience.
So I would like to buffer a certain amount of the streamed response before actually displaying it to the user.
Since I won't know how much data to expect, or what percentage of it I've received so far, it's difficult to decide how much data to buffer on the client side.
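Since the total length is unknown, you can't buffer by percentage, but you can buffer by a fixed character count (or a time window) before showing anything, which smooths out the start of the stream. Here is a minimal sketch in Python, assuming the stream arrives as an iterable of text chunks; the `min_chars` threshold is an arbitrary value you would tune, and with the OpenAI streaming API you would feed in the text extracted from each streamed event rather than raw strings:

```python
from typing import Iterable, Iterator

def buffered_stream(chunks: Iterable[str], min_chars: int = 40) -> Iterator[str]:
    """Hold back streamed text until at least `min_chars` characters have
    accumulated, then flush the buffer in one piece. Later chunks are
    yielded as they arrive, so only the start of the stream is delayed."""
    buffer: list[str] = []
    buffered = 0
    primed = False  # True once the initial buffer has been flushed
    for chunk in chunks:
        if primed:
            yield chunk
            continue
        buffer.append(chunk)
        buffered += len(chunk)
        if buffered >= min_chars:
            yield "".join(buffer)
            buffer.clear()
            primed = True
    # Flush whatever is left if the stream ended before reaching the threshold.
    if buffer:
        yield "".join(buffer)
```

You could also flush on a timeout instead of a character count, so a slow stream still shows something to the user; either way the trade-off is added latency at the start in exchange for fewer visible stalls.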