Agreed, Wikipedia querying would be FASCINATING. Personally, I’m not sure how I’d use it in a product yet, but I have found myself using GPT-3 to search for answers to questions that I know Google won’t successfully answer.
Actually, I was totally wrong when I said my corpus would eventually exceed 1 GB. I was thinking of MB. I have 3 MB so far, and I’ll never come close to 1 GB in my corpus.
If Q&A based on Wikipedia is a common use case, we may consider storing an up-to-date version of Wikipedia, accessible to anyone.
That sounds so cool, could be really useful too. I imagine OpenAI already has the data hosted somewhere for training purposes anyway, right? Would simply be a matter of building it into the API.
Just read through this thread and I too find this to be a fascinating space. Are you able to share any updates on this since June?
These are some pretty interesting ideas.
IMO the most general solution would be to make things a little more human. If you can’t just shove an entire article down its throat, then give it a few heuristics to narrow down potentially useful passages. Have GPT output a list of potentially relevant keywords, then start with the chunk of text containing the most matches.
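To make that concrete, here’s a minimal sketch of the keyword heuristic: split the article into chunks, then score each chunk by keyword overlap. The keyword list would come from a GPT call in practice; here it’s hard-coded so the example is self-contained, and the chunking/scoring details are just one possible choice.

```python
def split_into_chunks(text, chunk_size=200):
    """Split an article into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def rank_chunks(chunks, keywords):
    """Order chunks by how many of the keywords each one contains."""
    def score(chunk):
        lowered = chunk.lower()
        return sum(1 for kw in keywords if kw.lower() in lowered)
    return sorted(chunks, key=score, reverse=True)

article = (
    "The Eiffel Tower is a wrought-iron lattice tower in Paris. "
    "It was designed by Gustave Eiffel's company and completed in 1889. "
    "Paris is the capital of France."
)
# In practice, ask GPT to generate these from the user's question.
keywords = ["Eiffel", "1889"]

chunks = split_into_chunks(article, chunk_size=15)
best = rank_chunks(chunks, keywords)[0]
print(best)  # the chunk with the most keyword hits
```

You’d then feed only the top-ranked chunk (or top few) into the prompt, instead of the whole article.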
And if latency is an issue, then maybe the solution is to teach your bot to predict potential questions ahead of time and start pre-fetching potential answers.