Discussion thread for "Foundational must read GPT/LLM papers"

Yup, a lot of this is going to come down to neuromorphic study. It could well be that the brain just duplicates very complex neurons because that's an easy evolutionary path, while carrying no computational advantage.

One thing I think might be super interesting to explore is using a liquid NN to set attention for a traditional transformer… maybe even dynamically allocating attention. I'm not sure how you build the bridge from text to liquid NNs, though. If I were a twenty-something again, I think I'd pick this area to go into.
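For what it's worth, here's a toy numpy sketch of the gist, not any published recipe: a simplified liquid time-constant (LTC) cell runs over the token embeddings, and its hidden state produces a per-position temperature that scales the attention logits of an ordinary scaled dot-product attention. The LTC update form, the temperature gating, and all weight shapes here are my own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def ltc_step(h, x, W_in, W_rec, tau, A, dt=0.05):
    """One Euler step of a (simplified) liquid time-constant cell:
    dh/dt = -(1/tau + f) * h + f * A, with f = tanh(W_in x + W_rec h).
    The effective decay rate depends on the input, which is the 'liquid' part."""
    f = np.tanh(W_in @ x + W_rec @ h)
    return h + dt * (-(1.0 / tau + f) * h + f * A)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def liquid_gated_attention(X, n_hidden=8):
    """Run the LTC cell over the sequence X (T x d) and use its state to set
    a per-position temperature on a single attention head (toy: Q = K = V = X,
    random untrained weights)."""
    T, d = X.shape
    W_in = rng.normal(0, 0.5, (n_hidden, d))
    W_rec = rng.normal(0, 0.5, (n_hidden, n_hidden))
    w_out = rng.normal(0, 0.5, n_hidden)
    tau, A = 1.0, 1.0
    h = np.zeros(n_hidden)
    temps = np.empty(T)
    for t in range(T):
        h = ltc_step(h, X[t], W_in, W_rec, tau, A)
        temps[t] = 1.0 + np.tanh(w_out @ h)  # temperature in (0, 2)
    logits = (X @ X.T) / np.sqrt(d)
    # Liquid state sharpens (temp > 1) or flattens (temp < 1) each row's attention
    weights = softmax(logits * temps[:, None])
    return weights @ X, weights, temps
```

Dynamically "allocating" attention would mean something fancier than a scalar temperature (e.g. gating whole heads), but even this shows the bridge problem: the LTC cell only ever sees the embeddings, so text has to be embedded before the liquid dynamics can react to it.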