i believe the Stream = True, allow us to send multiple call, as 1 prompt.
currently GPT 3.5Turbo only allowed Context in prompt. with 4096 single limit,
we quickly run into limit as the conversation continue ( if you want GPT to aware the conversation history )
My idea :
1st call = context
2nd call = bot instruction, limitation
3rd and Last call = summarize of previous converstation. ( limited to last 2 hours )
and get response from OpenAI.
But i have a problem.
I m not sure how to use data:[DONE] in my call.
Hopefully any senior coder here could give this stream = true a try. and share with me how to end the stream. Thanks in advance.
would like to seek your advice on summarize conversation, and compile into next API call.
call GPT3.5-turbo, pass only the conversation history to summarize it.
save the summary into my DB - chat_session > summary_so_far
next call, include the summary as context.
– only send summarize request on every 5th step ( count the user input entry )
So far in my test, I creating a sales agent, with context also include most FAQ.
even without knowing previous chat, the agent still handle well.
so, I guess do the summary call every 5th step would be enough. and save some token.