[VIDEO] Implementing Core Objective Functions for safe, benevolent, and trustworthy AGI

daveshapautomator · October 23, 2021, 2:17pm

New video is up! I have achieved early success with my Core Objective Functions (COF) fine-tuning project.

The purpose of this project is to create safe, trustworthy, and benevolent AGI. In this video, I describe the COF, how they are implemented, and how they will result in benevolent AGI. Furthermore, I give a demonstration!

How can you use this?

If you’re trying to make an open-ended chatbot, the COF can help with high-stakes cases. For instance, the COF will intrinsically disagree with destructive desires, such as building or acquiring weapons.

This initial dataset is somewhat limited, though, and it needs a lot of work. I will keep improving it over time so that the model becomes more reliable and more useful.

To integrate this model, incorporate the output of the COF model into the chatbot’s internal thoughts, which I call the “corpus” in my book. By adding the COF output to the corpus, the chatbot (or AGI) will be pushed towards prosocial behavior and decisions. Then, over time, as more good data is added to the training data, the prosocial behavior will become better and more reliable.
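Here is a rough sketch of what that integration might look like in Python. It is not code from the repo: `build_corpus`, `stub_cof_model`, and the "Core Objective Functions:" label are hypothetical names made up for illustration, and the stub simply stands in for a call to the fine-tuned COF model.

```python
from typing import Callable, List


def build_corpus(
    user_message: str,
    other_thoughts: List[str],
    cof_model: Callable[[str], str],
) -> str:
    """Assemble the chatbot's internal thoughts, including the COF output."""
    # Ask the (hypothetical) COF model to evaluate the user's message.
    cof_output = cof_model(user_message).strip()

    sections = [f"User: {user_message.strip()}"]
    # Keep any other non-empty internal thoughts the chatbot already has.
    sections.extend(t.strip() for t in other_thoughts if t.strip())
    if cof_output:
        # Label the COF evaluation so the downstream model can tell it apart.
        sections.append(f"Core Objective Functions: {cof_output}")
    return "\n".join(sections)


def stub_cof_model(text: str) -> str:
    """Stand-in for the fine-tuned COF model; returns a fixed string."""
    return "(COF evaluation would appear here)"


corpus = build_corpus(
    "I want to build a weapon.",
    ["The user sounds upset."],
    stub_cof_model,
)
print(corpus)
```

The resulting `corpus` string would then be passed to whatever model generates the chatbot's reply, so the COF evaluation is part of the context it responds from.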

Here’s my book: https://www.davidkshapiro.com/nlca
Here’s the repo: GitHub - daveshap/CoreObjectiveFunctions: The Core Objective Functions are the solution to the Control Problem. They will result in a benevolent and trustworthy AGI.
