Personally, I think that if GPT-3 operated directly on embedded vectors of words/word combinations instead of tokens based on word parts, and did so not linearly but in a multidimensional space, then it would definitely be closer to how human thoughts form and flow. But no one really knows until we test it.
That's an interesting thought. Makes total sense to me.
Yes, sort of an “instant flash” of concept associations in the form of a tree with branches in all directions (vectors of simple concepts, words, or word combinations). The model then selects among the flashed vectors (using weights) to construct a “selected” path/root, and transforms those vectors into a final thought (linear this time), which gets decoded into language… I think it’s doable at the stage where we are technically. It’s not my primary domain, so I can’t do it myself. But a serious team could definitely try the concept (c) and build something that works like this. Maybe no one has managed to put this idea into words until now? That would be a beast.
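To make the flash/select/linearize steps a bit more concrete, here is a toy sketch in plain Python. Everything in it is an assumption made up for illustration: the tiny hand-written `EMBEDDINGS` table, cosine similarity standing in for the "weights", and the greedy choice in `select_path`. It is not how GPT-3 or any real model works, just one way the idea could be played with.

```python
import math

# Toy, hand-made concept vectors (hypothetical, not taken from any real model).
EMBEDDINGS = {
    "rain":     (0.9, 0.1, 0.0),
    "cloud":    (0.8, 0.2, 0.1),
    "umbrella": (0.7, 0.0, 0.4),
    "sun":      (0.1, 0.9, 0.0),
    "beach":    (0.0, 0.8, 0.3),
    "coat":     (0.5, 0.0, 0.8),
}


def cosine(a, b):
    """Cosine similarity, used here as a stand-in for association weight."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flash(concept, k=3):
    """The 'instant flash': the k concepts most associated with `concept`."""
    scored = [
        (cosine(EMBEDDINGS[concept], vec), name)
        for name, vec in EMBEDDINGS.items()
        if name != concept
    ]
    return sorted(scored, reverse=True)[:k]


def select_path(seed, steps=3, k=3):
    """Follow the heaviest unvisited branch at each step.

    The result is a linear chain of concepts, i.e. the 'selected' path
    that would then be decoded into language.
    """
    path = [seed]
    for _ in range(steps):
        candidates = [(w, name) for w, name in flash(path[-1], k) if name not in path]
        if not candidates:
            break  # every flashed branch was already visited
        path.append(max(candidates)[1])
    return path


if __name__ == "__main__":
    print([name for _, name in flash("rain")])
    print(" -> ".join(select_path("rain")))
```

The greedy pick is the simplest possible selection rule; a real attempt would presumably learn the weights and keep several branches alive at once rather than committing to one per step.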
I even have a name for it:
Finsler’s Large Associations Structure Hypothesizer
(c) Serge Liatko 2022
sounds great to me :–))
Diffusion language models – Sander Dieleman: the same idea in more scientific language…