Help me understand how GPT-3 works

Hi! I have absolutely no knowledge about AI whatsoever, except that it will become a massive player in the years and decades to come. I have been playing around with the examples, specifically “Explain to a 2nd grader”, and I have been absolutely astonished by the outputs. How does this work? Does it search the internet for keywords you have input? Does it have a database? How does it put all the parts of the question you ask together? Also, what does the “stop sequences” setting do? Thanks to anyone who replies!

Welcome to the community @PeterLemur!

Simply put, GPT-3 is a language model, and given an input, you get an output. That output depends on how GPT-3 was trained.

How does this work? Does it search the internet for keywords you have inputted?

GPT-3 is a transformer language model: it tokenizes your input and, based on patterns learned from its training data, predicts a likely continuation. The training data came from huge datasets scraped from the Internet. It does not have live access to search the Internet; even Codex, the code-focused model, only works from what it saw during training. If I remember correctly, the GPT-3 models were trained on Internet data up to late 2019.
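To make “tokenizes your input” concrete, here is a toy sketch. The real GPT-3 uses byte-pair encoding over a vocabulary of roughly 50,000 tokens, so this tiny made-up vocabulary is only an illustration of the idea that text becomes a sequence of integer IDs before the model ever sees it:

```python
# Toy illustration of tokenization. Real GPT-3 uses byte-pair encoding (BPE)
# over a ~50k-token vocabulary; this tiny word-level vocab is made up.
TOY_VOCAB = {"Explain": 0, "to": 1, "a": 2, "2nd": 3, "grader": 4, "<unk>": 5}

def tokenize(text):
    """Map each whitespace-separated word to an integer token ID."""
    return [TOY_VOCAB.get(word, TOY_VOCAB["<unk>"]) for word in text.split()]

print(tokenize("Explain to a 2nd grader"))  # [0, 1, 2, 3, 4]
```

The model then operates entirely on these ID sequences; the output IDs are decoded back into text at the end.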

Does it have a database?

GPT-3 itself runs on supercomputers maintained by OpenAI, but as far as I know it does not look answers up in a database at run time: everything it “knows” is encoded in the model’s weights during training. (I originally asked GPT-3 itself whether it uses a database and it said yes, but asking the model about its own inner workings isn’t reliable; it will generate a plausible-sounding answer either way.) If someone who knows more about how GPT-3 works believes this is wrong, please inform me so I can change this!

What does the “stop sequences” module do

A stop sequence tells GPT-3 where to end its output and get ready for your next input. Without one, the model will keep generating until it uses up the tokens you allocated for the response, even when it has already answered the question. Setting a stop sequence saves you tokens and gets you a clear-cut answer instead of a run-on response full of extra information.
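Conceptually, the effect of a stop sequence is like truncating the generated text at the first place any of your stop strings appears. This is only a sketch of that idea (in the real API the model simply stops generating at that point; the function name and example text here are made up):

```python
def apply_stop_sequences(generated, stop_sequences):
    """Truncate text at the earliest occurrence of any stop sequence."""
    cut = len(generated)
    for stop in stop_sequences:
        idx = generated.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut]

# A Q&A-style completion: stopping at "\nQ:" keeps only the first answer,
# instead of letting the model run on and invent a new question.
text = "A: 4\nQ: What is 3+3?\nA: 6"
print(apply_stop_sequences(text, ["\nQ:"]))  # prints "A: 4"
```

This is why Q&A prompts often use something like `"\nQ:"` as a stop sequence: it cuts the model off the moment it starts drifting into the next (unwanted) question.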

1 Like

Thank you! Much appreciated.

1 Like

I like the RASA series most, but the other ones also have good content.
GPT-X is just a fancy series of calculations using very large matrices of learned weights.
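That “series of calculations” is mostly matrix arithmetic. The self-attention step those Rasa videos walk through can be sketched in plain Python; the vectors and numbers here are made up, and a real model uses learned projection matrices, many heads, and hundreds of layers:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q.K^T / sqrt(d)) . V"""
    d = len(K[0])
    out = []
    for q in Q:
        # How strongly this token attends to every token (including itself)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        # Output is a weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Three "tokens", each represented by a 2-dimensional vector (toy numbers)
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(X, X, X))
```

Each output vector is a blend of all the input vectors, weighted by how relevant they are to each other; stacking many such layers (plus the learned weight matrices) is essentially what the “very large matrices” above are doing.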

LSTM is dead Long Live Transformers! - YouTube

Transformers — transformers 4.10.1 documentation

Rasa Algorithm Whiteboard - Transformers & Attention 1: Self Attention

Rasa Algorithm Whiteboard - Transformers & Attention 2: Keys, Values, Queries