GPT-3 As a Tool for AI Safety

I recently wrote a paper, AIs Can (Already) Help Us with AI Safety, for a summer AI course I was taking, on using GPT-3 to help us come up with ways to solve AI Safety problems. The TL;DR: it's surprisingly good!

I'd be curious to know if anyone else has experimented with GPT-3's ability to propose solutions to AI Safety concerns.

Not sure if your intent is to get creative suggestions from the AI or to use it to solve safety issues directly.

If the latter, I used GPT-2/3 as a tool for generating training data for a safeguarding classifier for online messages. There is a paucity of data available for topics such as suicidal ideation, abuse, self-harm, eating disorders, etc. Social platforms try to remove these messages both to protect the authors and to stop amplification of the subject matter. Synthetic data generation proved to be a very powerful way of expanding a small dataset into a larger one that could then be used to train strong classifiers.
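For anyone curious what that kind of pipeline looks like, here is a minimal sketch of the expand-a-small-labeled-set idea. The details are assumptions, not the poster's actual code: `call_model` is a hypothetical stand-in for a real GPT-3 API call (here it just returns a canned completion so the sketch runs offline), and real pipelines would also deduplicate, filter for quality, and have a human review the synthetic examples.

```python
# Sketch of few-shot synthetic data generation for a text classifier.
# `call_model` is a hypothetical placeholder for an actual LM API call.

def build_prompt(seed_examples, label):
    """Turn a few labeled seed messages into a few-shot prompt that
    asks the model to continue the list with one more such message."""
    lines = [f'Examples of messages labeled "{label}":']
    for text in seed_examples:
        lines.append(f"- {text}")
    lines.append("- ")  # the model completes this bullet with a new message
    return "\n".join(lines)

def expand_dataset(seed_examples, label, call_model, n_synthetic=3):
    """Generate n_synthetic new (text, label) pairs from a small seed set."""
    synthetic = []
    for _ in range(n_synthetic):
        prompt = build_prompt(seed_examples, label)
        completion = call_model(prompt).strip()
        if completion:  # naive filter; real pipelines do far more vetting
            synthetic.append((completion, label))
    return synthetic

if __name__ == "__main__":
    seeds = ["example seed message one", "example seed message two"]
    # Stand-in for a GPT-3 call; always returns the same canned text.
    fake_model = lambda prompt: "a new synthetic message"
    print(expand_dataset(seeds, "sensitive", fake_model, n_synthetic=2))
```

The synthetic `(text, label)` pairs would then be mixed with the original seed data to train the classifier; the generation step only multiplies the data, it doesn't replace human labeling of the seeds.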


I'm interested in both! My paper is about the former, but it's wonderful to hear about the latter as well. It makes sense that GPT-3 could work quite well for expanding a small text dataset where the pattern is fairly clear from the meanings of the words. Fascinating! Thank you for sharing.