I saw that earlier, @daveshapautomator, when we exchanged posts on fine-tuning. In this case we are not using fine-tuning but embeddings (per your suggestion). Our embeddings come from 5000 documents, so we create the embeddings once and reuse them (as opposed to creating embeddings with each query).
Our prompt specifically says that if a question can't be answered, then say "Unknown". It works most of the time, but it fails spectacularly, as shown in my original example.
Are you saying that, despite using embeddings (and the "Unknown" prompt), we should do the following:
Q1: Ask if the KB has the answer (Yes/No).
Q2: If the answer is Yes, then complete; else say "Unknown".
If that is what you are recommending, how do we ask the bot to check the embeddings for whether it has the answer (Q1)?
This is a hard problem and I have not tackled a task quite like this one. I was working on something far larger in scale, using Wikipedia as a source of ground truth, but have not worked on it for a while. So keep in mind that my next ideas are only hypothetical, but now that I know a bit more about the problem you're working on, I think my recommendations may be more accurate:
Option 1: Search Index
Before I realized that GPT-3 has a lot of general knowledge, I was working on using Wikipedia as a source of truth to be incorporated into my cognitive architecture. For that problem, I had to find a way to make 5 million articles rapidly searchable, so I settled on SOLR as an offline search index. Since you are dealing with several orders of magnitude fewer documents, this solution should work rather well for you. Here's the video about that:
Essentially, you break the problem down into several steps, as outlined in my book (which I recommend you read if you haven't):
When a query comes in, you first use GPT-3 to generate appropriate search terms to fetch the correct information. This can be done with a prompt like "Extract search terms to Google the correct information for the following query" or something like that. GPT-3 is really great at writing Google queries.
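A rough sketch of that first step, assuming the pre-1.0 `openai` Python library (the prompt wording and model name are just illustrative, not gospel):

```python
import openai  # pre-1.0 SDK: pip install "openai<1.0"

def extract_search_terms(query: str) -> str:
    """Ask GPT-3 to turn a user query into search-engine-style terms."""
    prompt = (
        "Extract search terms to Google the correct information for the "
        f"following query.\n\nQuery: {query}\nSearch terms:"
    )
    response = openai.Completion.create(
        model="text-davinci-002",  # illustrative; use whatever engine you prefer
        prompt=prompt,
        temperature=0,
        max_tokens=32,
    )
    return response["choices"][0]["text"].strip()
```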
Use those search terms to search your SOLR instance for the correct documents; this will take some experimentation and tweaking. You could instead use the embedding/dot-product search method, which could hypothetically be more accurate.
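Querying SOLR from Python is just an HTTP call against its JSON API; something like this, assuming a local instance with a core named `kb` (both are assumptions on my part):

```python
import requests

SOLR_URL = "http://localhost:8983/solr/kb/select"  # hypothetical core name

def solr_search(terms: str, rows: int = 5) -> list:
    """Fetch the top documents matching the search terms from SOLR."""
    params = {"q": terms, "rows": rows, "wt": "json"}
    response = requests.get(SOLR_URL, params=params, timeout=10)
    response.raise_for_status()
    return response.json()["response"]["docs"]
```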
Once you've fetched the correct documents, which may still be too long to process in a single GPT-3 prompt, you will need to recursively summarize or distill them, as shown in my "compress anything" videos about the recursive summarizer. I cannot even take credit for this idea, as a commenter on my video pointed out the utility of using recursive summarization for recall/fetch purposes.
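The recursive summarizer itself is simple; a bare-bones version might look like this (chunk size and prompt are arbitrary and would need tuning):

```python
import openai  # pre-1.0 SDK

def summarize(text: str) -> str:
    """One GPT-3 summarization pass over a single chunk."""
    response = openai.Completion.create(
        model="text-davinci-002",  # illustrative
        prompt=f"Concisely summarize the following text:\n\n{text}\n\nSummary:",
        temperature=0,
        max_tokens=256,
    )
    return response["choices"][0]["text"].strip()

def distill(text: str, chunk_chars: int = 8000) -> str:
    """Recursively summarize until the text fits in a single prompt."""
    if len(text) <= chunk_chars:
        return text
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    condensed = "\n".join(summarize(chunk) for chunk in chunks)
    return distill(condensed, chunk_chars)
```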
You'll still need a check somewhere in here to know whether the correct information is even present. But as I've demonstrated in other threads, GPT-3 is really good at just giving you a BOOLEAN answer about whether or not something is present.
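Getting that boolean out of GPT-3 can be as simple as this (again, the prompt and model are just examples of the pattern, not a definitive recipe):

```python
import openai  # pre-1.0 SDK

def contains_answer(context: str, question: str) -> bool:
    """Yes/no check: does the fetched context actually answer the question?"""
    prompt = (
        f"Passage:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Does the passage contain the information needed to answer the "
        "question? Answer Yes or No:"
    )
    response = openai.Completion.create(
        model="text-davinci-002",  # illustrative
        prompt=prompt,
        temperature=0,
        max_tokens=3,
    )
    return response["choices"][0]["text"].strip().lower().startswith("yes")
```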
Option 2: Finetune a KB memory bot
This is only hypothetically possible and I have not tried it. GPT-3 is capable of storing quite a lot of information, so it's entirely possible that you can finetune a model on your 5000 documents and just use that to spit out the correct facts. Such a model would (1) be prohibitively expensive to train and (2) require quite a bit of experimentation to determine whether it's accurate and viable.
Once I cycle back to research mode, this may be one of my projects. Many, many, many people have a need to search an arbitrarily large knowledge base for facts and figures, and to be able to rely on it. Indeed, even my cognitive architecture would benefit from such a feature. Since OpenAI has now enabled the ability to continuously finetune a model, perhaps this method would not be so unwieldy. Essentially, as your KB grows, you just have incremental training sessions to integrate new information into your model.
As I mentioned, I will be experimenting with this… eventually. It would solve many problems to be able to simply accumulate all of an ACOG's memories in one model that has the magical ability of instant recall. However, I am skeptical of relying on black boxes like this for critical functionality. For example, imagine you have an autonomous agent at some point in the future: you want all of its memories to be explicit and declarative (X happened at Y time), not just embedded in a model. Kind of like how Teslas must keep sensor logs in case of a car crash.
Then I think OpenAI should update the documentation for embeddings too, @daveshapautomator, because the current documentation makes one believe that the answer is coming from the KB. We know at this point that either there is a bug or the documentation needs revision. Adding @moderators for any follow-up.
Great stuff @daveshapautomator.
Based on my own experience, even a fine-tuned Davinci will not be able to answer questions presented in the fine-tuning file unless each question was written multiple times and answered exactly the same way each time. Fine-tuning a model to answer correctly takes a massive amount of work duplicating every question again and again, and even with temperature = 0 it might invent stuff based on the atmosphere of the answer rather than THE answer.
I really like your idea about checking whether or not the information is there in the middle step. Nevertheless, it requires an additional model just for that and quite a complex architecture. That's where all the fun is, isn't it?
There are many ways to skin this cat. I suspect that accumulating memories/KB/documents/logs in a search index is probably the way to go. SOLR can search millions of documents in a fraction of a second - plenty fast enough for 99% of use cases.
Actually, that reminds me: I found Milvus but haven't used it yet. This may be the correct way to go: https://milvus.io/
If someone just figures out vector search + question answering, that alone would be a billion-dollar business. This is the way of the future.
Here. Specifically, the "Text search using embeddings" section. When a question is outside the scope, it should not return any document, but it does. Maybe there is some confidence score that gets returned too, and if that's the case, we could just treat a "low confidence score" as the boolean Yes/No. Thoughts?
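Something like this is what I'm imagining, with the similarity of the best match acting as that confidence score (the 0.8 threshold is a pure guess we would have to tune on our data):

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.8  # pure guess; needs tuning on our data

def best_match(query_vec: np.ndarray, doc_vecs: np.ndarray) -> tuple:
    """Return (index, score) of the document closest to the query embedding."""
    # OpenAI embeddings are normalized, so dot product == cosine similarity.
    scores = doc_vecs @ query_vec
    idx = int(np.argmax(scores))
    return idx, float(scores[idx])

# Usage with our precomputed arrays:
# idx, score = best_match(query_embedding, document_embeddings)
# if score < SIMILARITY_THRESHOLD:
#     return "Unknown"
```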
I try to avoid blackbox things like that since I have no idea how they work. That's why I never really used the now-deprecated Answers endpoint.
Personally, I would just store all the KBs and their associated embeddings in a local DB (with 5000 documents, you can easily do this in SQLite or even just a JSON document). Then when you have a search query, get the embedding for that and do a dot product against all 5000 documents. It will take less than a second, and you can just sort by highest dot product.
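A minimal sketch of that whole loop, assuming the pre-1.0 `openai` library and a plain JSON file as the "database" (the model name and file name are just examples):

```python
import json
import numpy as np
import openai  # pre-1.0 SDK

EMBED_MODEL = "text-embedding-ada-002"  # example model name

def embed(text: str) -> list:
    """Get one embedding vector from the API."""
    response = openai.Embedding.create(model=EMBED_MODEL, input=text)
    return response["data"][0]["embedding"]

def build_index(documents: list, path: str = "kb_index.json") -> None:
    """One-time step: embed every document and save the results locally."""
    index = [{"text": doc, "embedding": embed(doc)} for doc in documents]
    with open(path, "w") as f:
        json.dump(index, f)

def search(query: str, path: str = "kb_index.json", top_k: int = 5) -> list:
    """Per query: one embedding call, then dot products against all docs."""
    with open(path) as f:
        index = json.load(f)
    matrix = np.array([row["embedding"] for row in index])
    scores = matrix @ np.array(embed(query))
    best = np.argsort(scores)[::-1][:top_k]  # sort by highest dot product
    return [index[i]["text"] for i in best]
```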
In my ACOG when I am searching for relevant memories I just grab the 5 or 10 most relevant memories.
Thank you. Thank you. Thank you. Such help means a lot when you have told your employer that you will be leaving to work on a new venture while having no clue how the product works. Plus, you have to pay your kids' college fees.
Human memories are squishy. They tend to get compressed over time. This is called consolidation, which happens in the background and while we sleep. Alcohol, for instance, disrupts memory consolidation, which reduces learning.
I'm talking about artificial cognition, not a user-facing application. I'm merely explaining why I recall the top memories in my ACOG. For your chatbot, it may or may not make sense to do the same.
I made some more changes to the prompt, as seen below, and now I am getting "Unknown", as expected. I asked questions such as "What is 2+2?", "What is the capital of the USA?", "What is Node.js?", etc., and so far it is working.
I am an answering bot with a limited knowledge base about a series of web services. I have been trained on a context, and if you ask me a question, I will use the provided context to give you the answer. If you ask me a question that is not mentioned in the context, I will respond with "Unknown". For example, if you ask me how many days are in a week and the knowledge base I am trained on does not have this information, I will respond "Unknown".
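For completeness, here is roughly how we assemble the call around that prompt (the model name and formatting are just what we happen to use; treat this as a sketch):

```python
import openai  # pre-1.0 SDK

PREAMBLE = "I am an answering bot..."  # the full prompt text from above

def answer(question: str, context: str) -> str:
    """Combine the preamble, retrieved context, and question into one call."""
    prompt = f"{PREAMBLE}\n\nContext:\n{context}\n\nQ: {question}\nA:"
    response = openai.Completion.create(
        model="text-davinci-002",  # illustrative
        prompt=prompt,
        temperature=0,
        max_tokens=256,
    )
    return response["choices"][0]["text"].strip()
```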
I am watching your Roe v. Wade video. From the answers you are getting, it looks like the AI is also taking data from outside of the verdict. Is that correct?
Sorry for intruding on the convo here, but I'd like to get the same thing done as @chinmay.duke, with less OpenAI experience on my end ;).
Is there any video/article of yours anywhere I can watch/read up on the "dot product" search? I'm trying the embeddings method as well in my journey, after having tested the soon-to-be-deprecated Upload File / Answers method.