You need to break this into multiple steps. First step is to ask if the answer to the query is present (True/False) and only if true, then you proceed to extract the answer.
This method is demonstrated in my reduce confabulation video
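As a rough illustration (not the exact code from the video), the two-step flow could look like this in Python with the pre-1.0 openai library; the model name and prompt wording are placeholders:

```python
import openai  # pre-1.0 style openai library


def complete(prompt: str) -> str:
    # Deterministic completion; "text-davinci-003" is only an example model
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        temperature=0,
        max_tokens=100,
    )
    return resp["choices"][0]["text"].strip()


def answer(context: str, question: str) -> str:
    # Step 1: ask ONLY whether the answer is present (True/False)
    verdict = complete(
        f"Context:\n{context}\n\nQuestion: {question}\n"
        "Is the answer to the question present in the context? True or False:"
    )
    if not verdict.lower().startswith("true"):
        return "Unknown"
    # Step 2: only if True, extract the answer itself
    return complete(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```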
You will need a cognitive architecture to achieve that. It requires integrating several steps, such as asking the right questions, performing search, and integrating the results into a corpus. My book Natural Language Cognitive Architecture outlines how to do this.
I saw that earlier, @daveshapautomator, when we exchanged posts on fine-tuning. In this case we are not using fine-tuning but embeddings (per your suggestion). Our embeddings come from 5000 documents, so we create the embeddings once and reuse them (as opposed to creating embeddings with each query).
Our prompt specifically says that if a question can't be answered, it should say “Unknown”. It works many times, but it fails spectacularly, as shown in my original example.
Are you saying that, despite using embeddings (and the Unknown prompt), we should do the following:
Q1: Ask if the KB has the answer (Yes/No)
Q2: If the answer is Yes, then complete; else say Unknown.
If that is what you are recommending, how do we ask the bot to check the embeddings to see if it has the answer (Q1)?
This is a hard problem and I have not tackled a task quite this big. I was working on something far larger, using Wikipedia as a source of ground truth, but have not worked on it for a while. So keep in mind that my next ideas are only hypothetical, but now that I know a bit more about the problem you’re working on, I think my recommendations may be more accurate:
Before I realized that GPT-3 has a lot of general knowledge, I was working on using Wikipedia as a source of truth to be incorporated into my cognitive architecture. For this problem, I had to find a way to make 5 million articles rapidly searchable so I settled on SOLR as an offline search index. Since you are dealing with several orders of magnitude less, this solution should work rather well for you. Here’s the video about that:
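Video aside, querying a local SOLR core from Python takes only a few lines with the pysolr client; the core name and field names here are assumptions:

```python
import pysolr

# Assumes a local SOLR instance with a core named "wiki"
solr = pysolr.Solr("http://localhost:8983/solr/wiki", timeout=10)

# Fetch the ten most relevant documents for a query;
# "rows" is passed straight through as a Solr query parameter
results = solr.search("roe v wade", rows=10)
for doc in results:
    print(doc.get("id"), doc.get("title"))
```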
Essentially, you break the problem down into several steps, as outlined in my book (which I recommend you read if you haven’t):
This is only hypothetically possible and I have not tried it. GPT-3 is capable of storing quite a lot of information, so it’s entirely possible that you can finetune a model on your 5000 documents and just use that to spit out the correct facts. Such a model would (1) be prohibitively expensive to train and (2) require quite a bit of experimentation to determine if it’s accurate and viable.
Once I cycle back to research mode, this may be one of my projects. Many, many, many people have a need to search an arbitrarily large knowledge base for facts and figures, and to be able to rely on it. Indeed, even my cognitive architecture would benefit from such a feature. Since OpenAI has now enabled the ability to continuously finetune a model, perhaps this method would not be so unwieldy. Essentially, as your KB grows, you just have incremental training sessions to integrate new information into your model.
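If the legacy fine-tuning flow works the way I expect, an incremental session could be as simple as this sketch (the file name and prior model name are placeholders):

```python
import openai  # pre-1.0 style openai library

# Upload only the newly added documents as prompt/completion pairs
new_file = openai.File.create(
    file=open("new_docs.jsonl", "rb"), purpose="fine-tune"
)

# Continue training from the previously fine-tuned model instead of
# starting over from base davinci
openai.FineTune.create(
    training_file=new_file["id"],
    model="davinci:ft-your-org-2023-01-01-00-00-00",  # placeholder
)
```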
As I mentioned, I will be experimenting with this… eventually. It would solve many problems to be able to simply accumulate all of an ACOG’s memories in one model that has the magical ability of instant recall. However, I am skeptical of relying on black boxes like this for critical functionality. For example, imagine you have an autonomous agent at some point in the future: you want all of its memories to be explicit and declarative (X happened at Y time) and not just embedded in a model. Kind of like how Teslas must keep sensor logs in case of a car crash.
I am reading your book. Thanks for the recommendation. I will try that.
I think then OpenAI should update the documentation for Embeddings too, @daveshapautomator, because the current documentation makes one believe that the answer is coming from the KB. We know at this point that either there is a bug or the documentation needs revision. Adding @moderators for any follow-up.
Great stuff @daveshapautomator.
Based on my own experience, even a fine-tuned Davinci will not be able to answer questions presented in the fine-tuning file unless each question was written multiple times and answered exactly the same way each time. In order to fine-tune a model to answer correctly, there is a massive amount of work multiplying every question again and again, and yet even at temperature = 0 it might invent stuff based on the atmosphere of the answer rather than THE answer.
I really like your idea about checking whether or not the information is there in the middle step. Nevertheless, it requires an additional model just for that and quite a complex architecture. That’s where all the fun is, isn’t it?
Which documentation are you referring to? I may be misunderstanding and telling you lies by mistake.
There are many ways to skin this cat. I suspect that accumulating memories/KB/documents/logs in a search index is probably the way to go. SOLR can search millions of documents in a fraction of a second - plenty fast enough for 99% of use cases.
Actually, that reminds me, I found Milvus but haven’t used it yet. This may be the correct way to go: https://milvus.io/
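I haven’t tried it, but based on the pymilvus docs a search would look roughly like this; the collection name, vector field, and metric are all assumptions:

```python
from pymilvus import connections, Collection

# Assumes a running Milvus instance and an existing collection
# "kb_docs" with a float-vector field called "embedding"
connections.connect(host="localhost", port="19530")
kb = Collection("kb_docs")
kb.load()

query_vec = [0.0] * 1536  # placeholder: embedding of the user's question
hits = kb.search(
    data=[query_vec],
    anns_field="embedding",
    param={"metric_type": "IP", "params": {"nprobe": 10}},
    limit=5,
)
```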
If someone just figures out vector search + question answering, that alone would be a billion-dollar business. This is the way of the future.
Here. Specifically, the “Text search using embeddings” section. When a question is outside the scope, it should not return any document, but it does. Maybe there is some confidence score that gets returned too, and if that’s the case, we could just use a “low confidence score” as a boolean Yes/No. Thoughts?
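Something like this sketch is what I have in mind; the 0.8 cutoff is a made-up number that would need tuning:

```python
import numpy as np

# OpenAI embeddings are unit-normalized, so a dot product equals
# cosine similarity. doc_embs is a 2-D array, one row per document.
def kb_has_answer(query_emb, doc_embs, threshold=0.8):
    scores = np.asarray(doc_embs) @ np.asarray(query_emb)
    best = int(np.argmax(scores))
    # A low top score is our boolean "No" -> respond "Unknown"
    if scores[best] < threshold:
        return None
    return best, float(scores[best])
```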
OH, I see.
I try to avoid blackbox things like that since I have no idea how they work. That’s why I never really used the now-deprecated Answers endpoint.
Personally, I would just store all the KB documents and their associated embeddings in a local DB (with 5000, you can easily do this in SQLite or even just a JSON document). Then when you have a search query, get the embedding for that and do a dot product against all 5000 documents. It will take less than a second and you can just sort by highest dot product.
In my ACOG, when I am searching for memories, I just grab the 5 or 10 most relevant ones.
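As a sketch of that setup, assuming the 5000 embeddings were precomputed once and saved to a JSON file (file name and model are placeholders):

```python
import json

import numpy as np
import openai  # pre-1.0 style openai library

# Load the embeddings created once up front (list of {"text", "embedding"})
with open("kb.json") as f:
    kb = json.load(f)
emb_matrix = np.array([doc["embedding"] for doc in kb])


def top_k(query: str, k: int = 5):
    # Embed only the query at search time
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=query)
    q = np.array(resp["data"][0]["embedding"])
    scores = emb_matrix @ q              # dot product against all documents
    best = np.argsort(scores)[::-1][:k]  # indices of the top-k documents
    return [(kb[i]["text"], float(scores[i])) for i in best]
```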
Thank you. Thank you. Thank you. Such help means a lot when you have told your employer that you are leaving to work on a new venture while having no clue how the product works. Plus you have to pay your kids’ college fees.
Why would you pick 5 or 10 top and not the first one?
Human memories are squishy. They tend to get compressed over time. This is called consolidation which happens in the background and while we sleep. Alcohol, for instance, disrupts memory consolidation, which reduces learning.
So you will try to answer using all of these 5-10 top results? If yes, then which one will you show to the user?
I’m talking about an artificial cognition, not a user facing application. I’m merely explaining why I recall the top memories in my ACOG. For your chatbot, it may or may not make sense to do the same.
I made some more changes to the prompt, as seen below, and now I am getting “Unknown”, as expected. I asked questions such as “what is 2+2”, “what is the capital of the USA”, and “what is Node.js”, and so far it is working.
I am an answering bot with a limited knowledge base about a series of web services. I have been trained on a context, and if you ask me a question, I will use the provided context to give you the answer. If you ask me a question that is not mentioned in the context, I will respond with “Unknown”. For example, if you ask me how many days are in a week and the knowledge base I am trained on does not have this information, I will respond “Unknown”.
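For reference, here is a hypothetical sketch of how such a prompt might be assembled with the retrieved passages at query time; the wording is paraphrased from the prompt above:

```python
def build_prompt(passages: list[str], question: str) -> str:
    # Inject the retrieved passages as the "context" the prompt refers to
    context = "\n\n".join(passages)
    return (
        "I am an answering bot with a limited knowledge base about a series "
        "of web services. I will use the provided context to answer. If the "
        'question is not mentioned in the context, I will respond "Unknown".\n\n'
        f"Context:\n{context}\n\nQ: {question}\nA:"
    )
```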
I love this prompt. There should be a prompt “bank” for such situations.
I am watching your Roe v Wade video. From the answers you are getting, it looks like the AI is also taking data from outside of the verdict. Is that correct?
I did not give it any other data sources, no.