I apologize if this is the wrong place to ask - the chatGPT discussion group is experiencing technical difficulties, and I can’t join. I am interested in two sorts of documentation: first, a technical report on chatGPT written for computer scientists, and second, a concise summary of the applications that have been established thus far. I know exactly how neural nets work - I don’t need that discussion (although it’s likely to be present in the type of report for which I am searching). I just want details about the system in which chatGPT’s net is embedded.
chatGPT is just OpenAI’s website where you can interact with the various models they have.
If you want research on the specific models you’ll find that here:
Your keywords are “decoder-only transformer-based reinforcement learning from human feedback large language model AI with byte-pair encoding tokenizer and attention layers”. And GPT-2, because it is at least open-source and researchable.
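Since “attention layers” is one of those keywords: the core operation is small enough to sketch. Here’s a minimal pure-Python version of scaled dot-product attention on toy vectors (no batching, masking, or learned projections - just the weighted-sum idea from the “Attention Is All You Need” paper):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Each query scores every key, the scores become softmax weights,
    and the output is the weighted sum of the value vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy example: one query that is more similar to the first key,
# so the first value dominates the output.
result = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 0.0], [0.0, 1.0]],
)
```

The real thing stacks many of these heads with learned query/key/value projection matrices, but the arithmetic is exactly this.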
Excellent! There are about 170 papers on the research index. I was hoping for a handful, and for the level of summarization that implies. It occurred to me that what I really want is a chatGPT textbook for a University course, but I bet this hasn’t happened yet.
Always happy to help!
Yeah, I started writing a teacher’s guide ~6 months ago, but I quickly realized that there isn’t enough well-established, peer-reviewed, published research to fill a course.
OK that’s what I figured. Did you by any chance establish any kind of reading list and/or bibliography? Or perhaps advice about where to start one’s journey into the field? Again, I know neural nets, so I don’t need that. I am particularly interested in the origins of the idea of “temperature” and “embedding”, because they are closely related to my own work, but I need information about the front-end and task management. Thanks a bunck - oops, cat-induced typo. . . .
If you already know about neural networks then you don’t need much, but I can recommend this:
Quick Lesson Plan for Understanding GPT
- Read the Paper: Start with the foundational paper “Attention Is All You Need” by Vaswani et al. to understand the core architecture behind GPT.
- Watch the Video: by Andrej Karpathy (one of the founding members of OpenAI) for an in-depth understanding of how models like GPT work.
- Take the Course: “Prompt Engineering for Developers” (it’s free) by deeplearning.ai to learn how to interact effectively with GPT models.
This should give you a comprehensive understanding of GPT and how to use it.
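Since you mentioned temperature earlier: in GPT-style sampling it’s just a divisor applied to the logits before the softmax - higher temperatures flatten the output distribution, lower ones sharpen it toward the argmax. A minimal sketch with made-up toy logits (not tied to any real model):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by temperature, then apply a numerically stable softmax.
    As temperature -> 0 this approaches argmax; as it grows, the
    distribution approaches uniform."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits: the first token is the model's favorite.
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.5)  # peaky
flat = softmax_with_temperature(logits, temperature=2.0)   # closer to uniform
```

At low temperature the favorite token takes almost all the probability mass; at high temperature the choices even out, which is why the setting reads as a creativity/randomness knob in the API.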