Arvin Ash: "How Does ChatGPT Actually Work? Behind the Scenes" - Introduction by ChatGPT

The intention of this text is to add some knowledge for users of OpenAI models. If you think it has some scientific or technical value for your research or work, please use it with discretion. The following comment and replies were posted:

How Does ChatGPT Actually Work? Behind the Scenes

Arvin Ash
YouTube channel - Apr 08, 2023
Video description: “SUMMARY
ChatGPT is an intelligent chatbot that uses natural language processing. The GPT stands for Generative Pre-trained Transformer, which means it generates responses, it is pre-trained by humans, and it transforms input data into an output. This model was created by an artificial intelligence research company called OpenAI.

ChatGPT’s power is the ability to interpret the context and meaning of a query and produce a relevant answer in grammatically correct and natural language, based on the information that it has been trained on.

It uses neural networking, with supervised learning and reinforcement learning, two key components of modern machine learning. What it does fundamentally is predict what words, phrases and sentences are likely to be associated with the input made. It then chooses the words and sentences that it deems most likely to be associated with the input. So it attempts to understand your prompt and then output words and sentences that it predicts will best answer your question, based on the data it was trained on.

It also randomizes some outputs, so the answers you get for the same input will often be different. Fundamentally, ChatGPT tries to determine what words would most likely be expected, having learned how your input compares to words written on billions of webpages, books, and other data that it has been trained on.

But it’s not like the predictive text on your phone that’s just guessing what the word will be based on the letters it sees. ChatGPT attempts to create fully coherent sentences as a response to any input. And it doesn’t just stop at the sentence level. It’s generating sentences and even paragraphs that could follow your input.

If you ask it to complete this sentence, “Quantum mechanics is…”, the processing that happens behind the scenes goes something like this: it calculates, from all the instances of this text, what word comes next and what fraction of the time. It doesn’t look literally at text, but it looks for matches in context and meaning.

The end result is that it produces a ranked list of words that might follow, together with their “probabilities.” So its calculations might produce something like this for the next word that would follow after the word “is”:

a 4.5%
based 3.8%
fundamentally 3.5%
described 3.2%
many 0.7%

It chooses the next word based on this ranking.

But the sentence completion model is not enough, because you might ask it to do something where that strategy might not be appropriate.

In the first stage of the training process, human contractors play the role of both a user and the ideal chatbot. Each training example consists of a conversation, with the goal of training the model to have human-like conversations.

Through this supervised human-taught process, it learns to come up with an output that is more than just sentence completion. It learns patterns about the context and meaning of various inputs so that it can respond appropriately.

But human training has scale limitations. Human trainers could not possibly anticipate all the questions that could ever be asked. For this it uses a third step which is called reinforcement learning. This is a type of unsupervised learning. This process trains the model where no specific output is associated with any given input.

Instead the model is trained to learn the underlying context and patterns in the input data based on its earlier human-taught pretraining.
This way the model can process a huge amount of data from various sources, and learn the patterns from texts and sentences of a near limitless number of subjects. The dataset used to train ChatGPT which is based on GPT-3.5 is about 45 terabytes of data.”
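
To make the ranked list in the description above more concrete, here is a minimal Python sketch of the two simplest ways to pick the next word after “is”. The five words and their percentages are taken from the description; everything else is my own illustration and not how OpenAI implements it. In particular, the function names are hypothetical, and the sampling here is limited to these five candidates, whereas a real model weighs its whole vocabulary.

```python
import random

# Candidate next words and their probabilities (in percent), copied from
# the example in the video description. They do not add up to 100 because
# the remaining probability belongs to words that are not listed.
candidates = {
    "a": 4.5,
    "based": 3.8,
    "fundamentally": 3.5,
    "described": 3.2,
    "many": 0.7,
}

def pick_greedy(probs):
    """Always return the top-ranked word (no randomness at all)."""
    return max(probs, key=probs.get)

def pick_sampled(probs, rng):
    """Return a word at random, weighted by its probability.

    Assumption: we sample only among the listed candidates, so the
    weights are implicitly renormalized over these five words.
    """
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy = pick_greedy(candidates)
samples = [pick_sampled(candidates, rng) for _ in range(1000)]

print("greedy choice:", greedy)
for word in candidates:
    print(word, samples.count(word))
```

The greedy choice is always “a”, while the sampled choices spread over all five words roughly in proportion to their percentages. That weighted randomness is why the same input can give different answers.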


Thanks for the video, Arvin. More than just an interesting video for your audience: a follow-up video is NECESSARY. This video is already the best first-step introduction to Large Language Models, a good starting point in the middle of this “darkness-is-full-of-terrors” of misunderstandings that has contaminated not only YouTube but is spreading all around the world as reactions against this kind of AI. It is necessary to dissipate fears. Please keep going.

You are the first to clarify the distinction between ChatGPT and the other completion models offered by the same company, OpenAI. ChatGPT is a kind of “advertising boy” for personal use: fluent natural language, with no settings that users can adjust. The completion models, meanwhile, are “more serious” AI engines for scientific and business usage. Some of their (hyper)parameters can be customized, but they demand more experience in writing prompts (the questions or requests made by users) to get more useful responses.

The “randomness” you referred to is called Temperature. In ChatGPT it is fixed at 0.7 - I prefer to say “70% of creative freedom”. On the completion models the user can set the Temperature, as well as the Penalties, which prevent repetitive answers when the Temperature is set too low.
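
For anyone who wants to see what Temperature and the Penalties do to the numbers, here is a small Python sketch. It is only an illustration under my own assumptions: the words and scores (“logits”) are invented, the function names are hypothetical, and the penalty formula follows my understanding of how the OpenAI API documentation describes frequency and presence penalties. Strictly speaking, Temperature is not a percentage but a number that the scores are divided by before they are turned into probabilities.

```python
import math
from collections import Counter

def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of words that were already generated.

    Assumed formula: score - count * frequency_penalty
                           - presence_penalty (only if count > 0).
    """
    counts = Counter(generated)
    return {
        word: score
        - counts[word] * frequency_penalty
        - (presence_penalty if counts[word] > 0 else 0.0)
        for word, score in logits.items()
    }

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by the Temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = {word: score / temperature for word, score in logits.items()}
    highest = max(scaled.values())  # subtracted for numerical stability
    exps = {word: math.exp(value - highest) for word, value in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

# Invented scores for four candidate words (not real model output).
logits = {"the": 2.0, "a": 1.0, "quantum": 0.5, "many": 0.0}

low = softmax_with_temperature(logits, 0.2)   # almost always the top word
mid = softmax_with_temperature(logits, 0.7)   # the value mentioned above
high = softmax_with_temperature(logits, 1.5)  # flatter, more varied choices

# Words already used in the answer so far get their scores reduced.
penalized = apply_penalties(
    logits, generated=["the", "the", "a"],
    frequency_penalty=0.5, presence_penalty=0.2,
)

for name, dist in (("T=0.2", low), ("T=0.7", mid), ("T=1.5", high)):
    print(name, {word: round(p, 3) for word, p in dist.items()})
print("penalized scores:", penalized)
```

A low Temperature concentrates almost all the probability on the top word, which is why answers become repetitive; the Penalties counteract that by pushing down words that have already appeared.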

Some people believe that a high Temperature is the cause of the aggressiveness of Bing AI (a GPT-4 model), but that is not true. How the model behaves toward humans comes from a separate training process, usually carried out by the owners and designers of the model, that shapes the style of its responses. Temperature has nothing to do with it.

So as you go a bit deeper into this understanding, you will see how necessary a follow-up video is to enlighten all of us. Thank you so much.
