Lucid transformer tutorials from Jake Tae

Just came across some very lucid tutorials on various aspects of transformer technology from a kid at Yale named Jake Tae. Thought I would recommend them, as they are mathematically and technically explicit yet also very accessible. Interested in thoughts from those more versed in the subject!


Just started reading this post - very interesting.

You may also want to look at GPT-3 related work by Gwern -


I have chatted with Gwern once or twice, and from perusing the website I have formed the impression that Gwern is simply one of the rare people who can hit a topic deeply and then move on (rather than repeating herself ad infinitum, like the rest of us).
