This is my usage scenario:
- Input a very long article T (hundreds or thousands of pages)
- Ask a question Q1 about T (e.g., "What happened to person X in article T?", "Is it true that X did Y?")
- Ask follow-up questions Q2, Q3, … to GPT-3 while keeping T in its context. We don't know what Q(n+1) will be until we receive the answer to Q(n).
As far as I know, with the OpenAI API I would have to re-send the article T every time I want an answer to Q2, Q3, and so on. Re-sending T for every question is inefficient in both latency and cost.
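To make the problem concrete, here is a minimal sketch of what I mean by re-sending T. The `build_prompt` helper and the prompt layout are my own placeholders, not anything from the OpenAI API; the actual request would then pass this string to a completion endpoint:

```python
def build_prompt(article: str, history: list[tuple[str, str]], question: str) -> str:
    """Compose the prompt for Q(n): article T, prior Q/A pairs, then Q(n)."""
    parts = [f"Article:\n{article}\n"]
    for q, a in history:
        parts.append(f"Q: {q}\nA: {a}\n")
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)

article_t = "(the very long article T goes here)"
history = [("What happened to X?", "X did Y.")]  # answers to Q1..Q(n-1)
prompt = build_prompt(article_t, history, "Is it true that X did Y?")
# Every request re-transmits all of article_t, so each question pays
# the full token cost of T again.
```

The point is that `article_t` appears in every single request, even though it never changes between questions.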
Is there anything in the OpenAI API that stores an intermediate state S right after inputting the article T, so that I can reuse (or load) S each time, append only the new question Q(n), and get the answer? If not, is there a workaround? (In an RNN, S would correspond to the hidden state vector just after reading T.)
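The only workaround I can think of is retrieval: pre-chunk T once, and for each question send only the most relevant chunks instead of all of T. Below is a toy sketch of that idea; the word-overlap scorer is just a stand-in for a real similarity search (e.g., embeddings plus cosine similarity), and the chunk size and `top_k` values are arbitrary assumptions:

```python
def chunk(text: str, size: int = 50) -> list[str]:
    """Split the article into fixed-size word chunks (done once, offline)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(question: str, passage: str) -> int:
    """Toy relevance score: number of words shared between question and passage."""
    return len(set(question.lower().split()) & set(passage.lower().split()))

def retrieve(question: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Pick the top_k chunks most relevant to the question (done per question)."""
    return sorted(chunks, key=lambda c: score(question, c), reverse=True)[:top_k]

article_t = "X traveled to the city. " * 50 + "Later X did Y at the harbor. " * 50
chunks = chunk(article_t)                    # one-time preprocessing of T
context = retrieve("Did X do Y?", chunks)    # per-question selection
# The prompt for each question now contains only `context`, not all of T.
```

Is this kind of chunk-and-retrieve approach the intended pattern, or is there something built into the API that is closer to saving the state S directly?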
In OpenAI Quickchat, it looks like there is a "Knowledge base" feature which corresponds to inputting T. It might be related to my question.