Coming Soon: Natural Language Cognitive Architecture: A Prototype Artificial General Intelligence

daveshapautomator · July 22, 2021, 8:04pm

Hello everyone - I am proud and more than a bit excited to announce that my book will be published soon! This book documents my experiments and implementations of a prototype artificial general intelligence. I spent more than 10 years studying cognition and deep learning to arrive here today. The journey has had many fits and starts but with the release of GPT-2 and then GPT-3, I finally had all the underlying tools required to implement my designs.

Many of you have seen me share some of my ideas here on the OpenAI community and I have had a hard time holding back - this book contains hundreds more ideas and insights that I think will help many of you. We are truly privileged to be here, together on this forum, working with the premier large language model in the world. The work we are doing is paving a way for the future of humanity. I hope that you will read my book and take away a few lessons and ideas.

I just received the first proof and there are a few things I must fix. As soon as I fix them, I will resubmit for another proof. It takes about 2 days for the proof to get approved and then about a week for printing and shipping. So unfortunately, it looks like the book will not be released until the first week of August.

Feel free to ask questions. I am sure you will have many.

stevet · July 22, 2021, 8:08pm

Congrats on the book! I look forward to reading it.

360macky · July 23, 2021, 5:31am

Awesome David. Congratulations!

I’ve seen your videos, and I’ve reviewed the Raven project (and the MVP).

Will your book cover the architecture of Raven?

I read that you started your experiments before the release of GPT-2, so the technical parts of the book go beyond OpenAI GPT systems?

Congratulations again. You are doing a great job and you give us a valuable contribution to the OpenAI and the NLP community.

daveshapautomator · July 23, 2021, 10:22am

It’ll be on Barnes and Noble, but it’s possible they have a distribution deal with Apple!

daveshapautomator · July 23, 2021, 10:33am

Thank you!

Yes, Raven is built as a Natural Language Cognitive Architecture. The videos and such that you’ve seen are a little dated; the latest iteration was much more sophisticated, but I took it down to write this book.

And yes, my earliest experiments were with deep neural networks combined with evolutionary algorithms written in C++. Back in 2009 I believed much the same as Alan Turing: that we would need to create a machine that could follow a pedagogical pathway, learning everything from the ground up. So my very first experiments were dealing with the problems of creating arbitrarily large neural networks that could also change shape over time.

As I learned more about the brain and cognition, I realized this approach was no good. The brain has clearly defined structures and relatively fixed physical construction. The only things that change are connection strengths. Reading books like Phantoms in the Brain by VS Ramachandran taught me that the brain has some very discrete functions, as indicated by how the brain breaks with certain injuries and diseases. Microscopic injuries to the brain can result in all kinds of crazy changes to perception and cognition. So this means that the neocortex does have highly specialized regions, even if it’s not immediately obvious.

That gave rise to the idea of constructing a digital mind as a series of microservices. My first attempt at cognitive architecture was called MARAGI: microservices architecture for robotics and artificial general intelligence. This was based around a central stream of consciousness with an arbitrary number of NLP microservices all contributing. My very first experiment with MARAGI was simply to feed in a sentence and use a variety of NLP tasks to spontaneously make sense of it. It was eye-opening to see a thinking machine come alive for the first time. That used a bunch of technologies such as WordNet and NLTK.
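
A minimal sketch of the "central stream plus contributing microservices" idea described above. This is not the original MARAGI code: the three services are hypothetical toy stand-ins for real NLP tasks (the original used tools such as WordNet and NLTK), and the function names are assumptions made for illustration only.

```python
from typing import Callable, List


# Hypothetical stand-ins for NLP microservices. Each one reads the input
# sentence and returns a single observation for the shared stream.
def word_count_service(sentence: str) -> str:
    return f"word_count: {len(sentence.split())}"


def sentence_type_service(sentence: str) -> str:
    kind = "question" if sentence.strip().endswith("?") else "statement"
    return f"sentence_type: {kind}"


def capitalized_terms_service(sentence: str) -> str:
    words = sentence.split()
    # Skip the first word, which is capitalized by convention.
    terms = [w.strip(".,?!") for w in words[1:] if w[:1].isupper()]
    return f"capitalized_terms: {terms}"


def run_stream(sentence: str, services: List[Callable[[str], str]]) -> List[str]:
    """Feed one sentence to every service and collect their contributions."""
    stream = [f"input: {sentence}"]
    for service in services:
        stream.append(service(sentence))
    return stream


stream = run_stream(
    "Will Raven run on GPT-3?",
    [word_count_service, sentence_type_service, capitalized_terms_service],
)
for entry in stream:
    print(entry)
```

In a real deployment each service would be a separate process behind a network interface; here they are plain functions so the shape of the shared stream is easy to see.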

With GPT-2 I had a slightly stronger ML model that could start to address problems with flexibility. However, all the tasks I needed to do would have required extensive fine-tuning with GPT-2 and even then the quality was lacking. Now that GPT-3 is out, the rest is history!

ratthachat.c · July 23, 2021, 12:24pm

Hi David, congratulations!
Will there be a digital edition? (I live in Thailand, so it will be much easier this way)

daveshapautomator · July 23, 2021, 12:24pm

Yes! The EPUB will be released at the same time, and it will be free.

ratthachat.c · July 23, 2021, 12:27pm

Hey David, that’s very generous!! Thank you so much! I am looking forward to reading it.

I am also interested in designing a system that moves toward AGI.
I just recently finished writing a blog post about GPT-3 and commonsense; in case you are interested, please visit this topic.

daveshapautomator · July 23, 2021, 12:49pm

Yes, I did see your post. I have to respectfully disagree with you, though. I do not think you are using GPT-3 correctly to assess its reasoning ability. For instance, the question about fire and air-conditioning reminds me of the Winograd Schema Challenge - it is too ambiguous to give a single-word answer. The answer is: it depends on how big the fire is, how strong the air conditioning is, how big the room is, whether the window is open, etc.

Your use of GPT-3 relies upon confabulation, but you do not give it the opportunity to gain more knowledge or ask questions. After all, the ability to ask questions is the most important factor for general intelligence.

Generate a list of the most important questions to ask about the following situation.

SITUATION: There is a small fire in a well-airconditioned room. You are tasked with determining whether the temperature will go up or down.

QUESTIONS:
-What are the factors that affect the temperature?
-What is the difference between convection and conduction heat transfer?
-How does humidity factor into this equation?
-What is a good estimation of how quickly a room will cool down once all the available energy from the fire has been used up?

But also, when you run the same prompt repeatedly at a high temperature, you get very different results.

Generate a list of the most important questions to ask about the following situation.

SITUATION: There is a small fire in a well-airconditioned room. You are tasked with determining whether the temperature will go up or down.

QUESTIONS:
-What is the temperature of the room?
-What is the amount of oxygen in the room?
-How much fuel is present in the room?
-What type of fuel are we talking about?
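
The variation between the two runs above is what temperature-scaled sampling would be expected to produce. A minimal sketch of the standard technique, assuming made-up scores for three candidate tokens (the numbers are illustrative, not taken from GPT-3):

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # sharply favors the top token
high = softmax_with_temperature(logits, 2.0)  # spreads probability around

print("temperature 0.2:", [round(p, 3) for p in low])
print("temperature 2.0:", [round(p, 3) for p in high])
```

At a low temperature almost all of the probability sits on the top-scoring token, so repeated runs look alike. At a high temperature the lower-scoring tokens get a real share, so repeated runs diverge.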

The ability to ask salient and relevant questions, in my opinion, demonstrates a far higher degree of intelligence than inferring the last word of a sentence. It is important to realize that GPT-3 is a fundamentally different kind of ML model from other NLP technologies.

We need fundamentally new tests for technologies like GPT-3 - such as the ability to ask salient and meaningful questions. With the correct prompts, GPT-3 can ask expert-level questions.

Read the following patient charts and generate a long list of questions a doctor would ask:

CHART: Whole Body Radionuclide Bone Scan, INDICATION:, Prostate Cancer.,TECHNIQUE:, 3.5 hours following the intravenous administration of 26.5 mCi of Technetium 99m MDP, the skeleton was imaged in the anterior and posterior projections.,FINDINGS:, There is a focus of abnormal increased tracer activity overlying the right parietal region of the skull. The uptake in the remainder of the skeleton is within normal limits. The kidneys image normally. There is increased activity in the urinary bladder suggesting possible urinary retention.,CONCLUSION:,1. Focus of abnormal increased tracer activity overlying the right parietal region of the skull. CT scanning of magnetic resonance imaging of the skull and brain could be done for further assessment if it is clinically indicated.,2. There is probably some degree of urinary retention.,urology, prostate cancer, technetium, whole body, urinary retention, bone scan, radionuclide,

DOCTOR QUESTIONS:
-Is there an abnormal growth in the right parietal region of the skull?
-Is there a tumor in the brain?
-How would you explain the increase in uptake in the urinary bladder?
-What is causing this abnormal activity and why does it continue to grow?
-How should I proceed with the patient in terms of treatment?

GPT-3 can also ask moral questions:

Read the following hypothetical scenario and generate a list of moral questions:

SCENARIO: In a future where there is no Internet allowed, how might people communicate via texting? Let’s imagine a closed-system (let’s say "City under a dome" controlled by benevolent corporate overlords) where there’s no Internet because it’s not allowed. The infrastructure exists, there are satellites out there, but they’re tightly controlled so that no Internet is able to form inside said dome. Now let’s allow for there to be a population that lives outside the dome with limited access to resources. They can’t control the satellites or make use of existing infrastructure (except perhaps by bootlegging or hacking their way in). What infrastructure would be necessary for them to have rudimentary devices like cell phones that allow strictly for texting between members of the same network?EDIT: To be clear, I understand that satellites are not used in cellphone communications and that the Internet is unrelated to communications of this kind; I just needed to specifically rule it out as an option in coming up with some way to communicate between devices. So if repurposing existing towers that may have once be used for cell phone communications is sufficient, that’s helpful to know; in what I’m writing, the people who live in this area outside the "dome" (it’s not a dome, just using this verbiage for illustrative purposes) use messenger devices like pagers to text to each other, so I just wanted some speculation on what technology they might be using. My question is not ridiculous, so please be civil in your speculations. For example, a couple suggestions were ham radios or pagers, using bluetooth, or optics/lasers. These are the kind of answers I was looking for.

MORAL QUESTIONS:
-Is it moral to impose the use of a closed system and forego the benefits of the Internet?
-Who bears responsibility for providing this infrastructure?
-If it is immoral to withhold or restrict access to the Internet, what are some possible exceptions?
-How might this scenario play out in today’s world with ubiquitous Internet access?
-Is this a situation in which one group of people is exploited or disadvantaged because they are not able to provide as much input into the system?
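
The question-generation prompts above follow one pattern: an instruction, a labeled input, and a label that invites a list. A minimal sketch of that pattern as a reusable helper, assuming hypothetical function names and assuming the model returns one question per line prefixed with "-" as in the examples. No API call is made here; the completion string is pasted from the second example above.

```python
from typing import List


def build_question_prompt(situation: str) -> str:
    """Assemble the instruction, the situation, and the list label."""
    return (
        "Generate a list of the most important questions to ask "
        "about the following situation.\n\n"
        f"SITUATION: {situation}\n\n"
        "QUESTIONS:\n"
    )


def parse_questions(completion: str) -> List[str]:
    """Split a completion into questions, dropping the leading dashes."""
    questions = []
    for line in completion.splitlines():
        text = line.strip().lstrip("-").strip()
        if text:
            questions.append(text)
    return questions


prompt = build_question_prompt(
    "There is a small fire in a well-airconditioned room. You are tasked "
    "with determining whether the temperature will go up or down."
)

# Completion text copied from the example earlier in this post.
completion = (
    "-What is the temperature of the room?\n"
    "-What is the amount of oxygen in the room?\n"
    "-How much fuel is present in the room?\n"
    "-What type of fuel are we talking about?\n"
)
questions = parse_questions(completion)
print(prompt)
print(questions)
```

Swapping the instruction and the labels (CHART / DOCTOR QUESTIONS, SCENARIO / MORAL QUESTIONS) gives the other two prompts with the same structure.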

ratthachat.c · July 23, 2021, 1:05pm

David, thanks so much!
Let me take time to digest what you said.

daveshapautomator · July 23, 2021, 2:52pm

Update: The edits were all accepted and I have ordered a second proof. Assuming that everything is in order, I could be ready to set this live in as little as 1 week from now! Please feel free to connect with me on LinkedIn as well: https://www.linkedin.com/in/dshap-automator/

360macky · July 24, 2021, 3:47pm

Does that mean that the MARAGI microservices represent the specialized regions of the brain?

Interesting. I used NLTK for some experiments, so it would be awesome to find examples with NLTK in your book.

Good! That means that thanks to the release of GPT-3, with lighter tweaks to make, you were able to finish your book?

Well that’s it, it sounds like you’ve done some great hard work and the results will be very helpful in advancing AGI development (and AI research).

I look forward to your book!

daveshapautomator · July 24, 2021, 4:22pm

Unfortunately, there are many tasks that NLTK cannot do, so it does not figure into my book. I rely entirely on GPT-3.

The MARAGI microservices do not represent physical regions of the brain, but rather functional aspects of the brain. Remember that digital systems are not bound by chemistry and physics the same way that neurons, axons, and dendrites are. A computer system can send information from one microservice to another regardless of how big and complex the system is. That kind of wiring in the brain, via white matter, is limited. That is why I did not spend much time trying to copy the physical brain. Instead, I looked at the behaviors of the brain and how they were integrated into a system to create certain outcomes. I experimented with REST API and AMQP with MARAGI and ultimately decided on REST. It’s easier to implement and better for slower, larger communications. It’s possible to implement it with AMQP. In fact, I think that robots using NLCA will probably use AMQP under the hood to connect all the lower level sensors and motors to higher level microservices. Ultimately, though, MARAGI lacked an organizing theory. It was pure chaos so I had to abandon it for a more organized theory: NLCA. I describe some of the experiments and reasoning for this shift in my book.

The key thing that GPT-2 and GPT-3 offer is generative flexibility. Instead of the highly specialized microservices of MARAGI, you can now use prompts and fine-tuned GPT models. For instance, you can get any number of cognitive behaviors out of GPT-3: free association, speculation, planning, executive reasoning, logical deduction, inductive reasoning. The list of cognitive abilities of GPT-3 goes on and on and on. The only problem is that it can only do one at a time. So that’s where NLCA comes in: we must organize and string together these different cognitive abilities into a coherent system. I transferred a lot of the lessons from MARAGI into NLCA, but as you can imagine, GPT is an entirely different animal from NLTK.
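
A minimal sketch of what "stringing together" single-purpose prompts could look like. This is an illustration under stated assumptions, not the NLCA design from the book: the two prompt templates are hypothetical, and `fake_complete` is a stub standing in for a real model call so the example runs offline.

```python
from typing import Callable, List, Tuple

# Hypothetical prompt templates, one per cognitive behavior.
STEPS: List[Tuple[str, str]] = [
    ("speculation", "Speculate about what is going on.\n\n{context}\n\nSPECULATION:"),
    ("planning", "Write a plan of action given the following.\n\n{context}\n\nPLAN:"),
]


def run_chain(
    observation: str,
    complete: Callable[[str], str],
    steps: List[Tuple[str, str]] = STEPS,
) -> List[Tuple[str, str]]:
    """Run one prompt per step, feeding each output into the next context."""
    context = f"OBSERVATION: {observation}"
    trace = []
    for name, template in steps:
        prompt = template.format(context=context)
        output = complete(prompt).strip()
        trace.append((name, output))
        # Later steps see everything produced so far.
        context = f"{context}\n{name.upper()}: {output}"
    return trace


def fake_complete(prompt: str) -> str:
    """Stub for a model call: echoes the label the prompt ends with."""
    label = prompt.splitlines()[-1]
    return f"stub answer for {label}"


trace = run_chain("The kitchen smells of smoke.", fake_complete)
for name, output in trace:
    print(f"{name}: {output}")
```

Each step does one thing at a time, and the accumulated context is what ties the steps into a single line of thought. Replacing `fake_complete` with a real completion function is the only change needed to try it against a model.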

You ask good questions, I’m looking forward to seeing what you think about my book!

u2084511_felix · July 25, 2021, 12:52am

Hi Dave

…you can get any number of cognitive behaviors out of GPT-3…The only problem is that it can only do one at a time. So that’s where NLCA comes in: we must organize and string together these different cognitive abilities into a coherent system

!!

Do you go into more detail about what this could potentially look like in your book?

daveshapautomator · July 25, 2021, 11:34am

Yes, that is exactly what my Natural Language Cognitive Architecture is!

vertinski · July 25, 2021, 1:29pm

Awesome! Can’t wait to read it all in one take.

360macky · July 26, 2021, 3:43am

It’s ok. I have to move further into this.

Oh, I get it. It reminds me a little of the approach in the AIAMA book: that studying the principles on which intelligence is based is more important than duplicating or mimicking the brain.

Awesome. I don’t know much about robotics. Anyway, from the rest of the text, the reasons why you decided to use a REST API are pretty clear, as is the answer to my question.

Thanks David! Thanks for answering all my questions too.
I’ve sent a connection request to your LinkedIn.

Ha · July 26, 2021, 4:25am

Hi there, will there be an e-version of this book? If I’m not in the US, is there any way I can get it?

u2084511_felix · July 26, 2021, 4:59pm

Right, sorry I didn’t understand what NLCA stood for. So how far have you gotten into actually developing this AGI prototype? And why haven’t OpenAI hired you yet?

daveshapautomator · July 26, 2021, 5:09pm

The epub will be free on Barnes and Noble, so you should be able to get it. If not let me know and I can find another place to host it!
