I’ve been working on this problem some more. I may have partly untangled it.
Start by breaking your database down into very atomic entries (individual memories, chat logs, news items from RSS feeds, etc.): a few sentences each, maximum.
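Here is a rough sketch of what I mean by atomic entries, in Python. The regex-based sentence split and the example article are just placeholders; a real pipeline would probably use a proper sentence tokenizer:

```python
import re

def to_atomic_snippets(text, max_sentences=3):
    """Naively split a document into snippets of at most a few sentences each.
    (Sketch only; swap in a real sentence tokenizer for production use.)"""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    return [' '.join(sentences[i:i + max_sentences])
            for i in range(0, len(sentences), max_sentences)]

# Example: one RSS item becomes one or more short, atomic entries.
article = ("The probe entered orbit on Tuesday. Mission control confirmed "
           "all instruments are healthy. Science operations begin next month. "
           "The first data release is expected in the fall.")
for snippet in to_atomic_snippets(article):
    print(snippet)
```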
Then you can use an index/search tool (like Solr or Elasticsearch) to find relevant snippets, even if they come from very different sources: news articles, Wikipedia articles, previous conversations, PubMed papers, etc. Because the snippets are so short, you can rapidly compile them into a reasonably sized document.
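As a sketch of that search-and-compile step (assuming a local Elasticsearch cluster and the 8.x Python client; the index name, example snippets, and question are made up):

```python
from elasticsearch import Elasticsearch  # assumes the 8.x Python client

es = Elasticsearch("http://localhost:9200")  # hypothetical local cluster

# Index each atomic snippet, tagged with where it came from.
snippets = [
    ("rss",  "Mission control confirmed all instruments are healthy."),
    ("wiki", "The probe was launched in 2021 to study the outer planets."),
]
for i, (source, text) in enumerate(snippets):
    es.index(index="snippets", id=i, document={"source": source, "text": text})

# At question time, pull back the most relevant snippets, regardless of source,
# and concatenate them into one reasonably sized context document.
question = "Are the probe's instruments working?"
resp = es.search(index="snippets", query={"match": {"text": question}}, size=20)
context = "\n".join(hit["_source"]["text"] for hit in resp["hits"]["hits"])
print(context)
```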
Then, with that reasonably sized document as context, you can rely on GPT-3’s internal understanding of the world to produce good answers to any problem. (In theory, anyway; this last part may be wishful thinking on my part.)
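That last step might look something like this, assuming the older (pre-1.0) openai Python library and its Completion endpoint; the model name, prompt wording, and placeholder context/question are just illustrative:

```python
import openai  # assumes the pre-1.0 openai library and its Completion endpoint

openai.api_key = "sk-..."  # placeholder

context = "Mission control confirmed all instruments are healthy."  # compiled snippets
question = "Are the probe's instruments working?"

prompt = (
    "Answer the question using only the information below.\n\n"
    f"{context}\n\n"
    f"Question: {question}\nAnswer:"
)
response = openai.Completion.create(
    model="text-davinci-003",  # any GPT-3 completion model you have access to
    prompt=prompt,
    max_tokens=256,
    temperature=0.2,
)
print(response["choices"][0]["text"].strip())
```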
In the future, hopefully GPT-4 can ingest 20,000 tokens instead of 2,000, so you can give it larger chunks of information. Maybe GPT-5 can take in 2M tokens.
Anyways, in the meantime, I think atomic/granular entries are the way to go.