Question about training the chat, and what info it retains

I’m a total and complete newbie who is at that “I don’t know what I don’t even know” stage, so please feel free to make fun of me (provided my question is answered).

In my question I’ll probably make incorrect assertions, so please feel free to correct these…

Here’s an assertion: when I start a new chat, I start training the chat. I provide it information to go on and a question in which it will respond. I hit “Enter” and the chat responds.

To this response, I can give give feedback via “regenerate”, “bad response”, and probably simply text. This feedback further trains the chat. As this process of question-response-question continues, the more the chat is trained.

If I feel that I’ve trained a chat well, I can save the chat and return to it later, continuing the process towards the Singularity.

Based on those assertions, here’s my question: when I start a new chat, is any of the training from other chats included in that new chat, or does it start from scratch?

You have a good question about those downvotes. There’s no manual for what everything means.

The thumbs down button, or voting on whether the conversation is good, doesn’t affect the quality of the AI responses. It only gives feedback to OpenAI. Your input may be used for future model training in the distant future after going through other steps of knowledge worker good/bad ratings.

Regenerate will just give you another response for the same input. ChatGPT is tuned so it almost never gives exactly the same answer.

A chat session is independent from others. What the AI learns is simply what it sees from earlier conversation in a particular chat (which has limited visibility to the AI after the scrolling conversation grows long).

So if you start your session “you now act like a pirate, so that you reproduce an immersive interactive chat experience between me and an authentic swashbuckler.”, that AI will say GARRR. Then in the left conversation selector, you can return to other past chats where you were doing other work.

I’m thinking of each chat as a discrete entity that needs to be trained.

IS this the wrong way to think of it?

Maybe I should think of the AI as a whole, and NOT break it up into “trainable chats”. Also… how much of what I do REALLY trains the AI? I’m sure it’s getting 99.99…% of its training data comes from “not me”, even if I’m giving it code to refactor.

Um, something I missed in your example: I tell the chat to talk like a pirate, and then I go to past chats to do stuff. When the AI responds inside of past chats, will it talk like a pirate?

The only training that happens is a months long process that builds a new model with a training corpus and knowledge. And then OpenAI periodically running different machine learning processes to overlay some new behaviors on that model (like saying “I’m sorry” to things unwanted).

The AI model is unchanged by your interactions with it.

Language AI generates the response that would follow an input prompt, a sequence of tokens or words. It predicts the next part of output to follow, a word at a time.

For example, if my input is “A yellow fruit is a”, the next word the bare AI might produce is " banana".

ChatGPT builds on that technology by placing messages in containers, and prompting the AI where it should write a response. The AI then predicts what a response to a question would be instead of just what the next words of a paragraph should be.

See if you can be an AI and figure out the next thing to output:

system programming: The chatbot is an expert in outdoor activities.
user: What’s the tallest mountain in the world?
chatbot: The tallest mountain in the world is Mount Everest, which is part of the Himalayas. Its peak is 8,848 meters (29,029 feet) above sea level.
user: How long does it typically take to climb it?
chatbot: Climbing Mount Everest typically takes about two months. Climbers need to acclimatize to the high altitude, so they spend weeks at base camps and make shorter ascents before attempting to reach the summit.
user: Are there any specific dangers associated with this climb?
chatbot: Yes, there are several dangers associated with climbing Mount Everest, including severe weather conditions, avalanches, crevasses, and the risk of altitude sickness. The “Death Zone” above 8,000 meters is particularly hazardous due to its thin air.
user: Has anyone ever climbed it without oxygen?
chatbot: (write your answer here)

That’s what will be sent to the stateless AI every time there is a new question: a record of the recent chat. That lets the chatbot understand what “it” is in the last question.

The AI hasn’t learned anything. It has been shown the prior conversation.

So when you switch conversations in ChatGPT, a different history of the recent chat is sent when you continue.

(example conversation generated completely by AI)

Fun game!

But… I don’t know what the chatbot’s response would be. I mean, based on the information given, there’s nothing that says that “anyone needs oxygen to survive”.

But can the AI make that inference?

What’s the answer? I’m super curious!

I’m going with this:

“I don’t know if anyone has ever climbed Mt Everest without oxygen, but the chances of them dying from this is very high.”

Fortunately the AI has been pretrained on a massive amount of knowledge, incomprehensibly more than one person could know. It is able to not just predict the next word you wanted to type on your phone, but can infer complete solutions to complex problems.

I paste the example chat into what is called a “completions” AI model (that doesn’t have the built in chat formatting of ChatGPT), and the AI itself learns from the previous exchanges what it is supposed to do next:

I wasn’t satisfied with the first answer that was a bit ambiguous - like something that you might write yourself if you didn’t really know but had to make up an answer. So asked another question also, where green is the most certain words the AI is predicting. (this is the gpt-3.5-turbo-instruct model)

Fact checking the AI:

The first two were the first to climb without oxygen - and last guy has climbed Everest ten times without oxygen…