Positional encoding and implicit grammar

Hello,
I am a scientist in the field of computational linguistics, and I have a question about how GPT models process grammar. If I understand correctly, word order, and thus grammar, is conveyed solely through the positional encoding: without it, one could change the order of the words in the input and get the same probabilities for the output.
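
A toy sketch of what I mean (my own example, ignoring the causal mask and multi-head details): a single self-attention layer without positional encoding is permutation-equivariant, so reordering the input rows only reorders the output rows.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                        # embedding dimension
X = rng.normal(size=(5, d))  # 5 "word" embeddings, no positional encoding
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def attention(X):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return weights @ V

perm = rng.permutation(5)
out = attention(X)
out_perm = attention(X[perm])
print(np.allclose(out[perm], out_perm))  # True: the output is merely permuted
```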

But, to my knowledge, the positional encoding is realized by adding a small vector to the word embeddings, so that the semantics of the words change only slightly. As long as the operations on the input are linear, such as the multiplication with the key and query matrices, the small added vector is preserved. But how does the model behave in the non-linear processing steps? Is the small vector then “absorbed” by the larger embedding vector?
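
For concreteness, this is the additive scheme I am referring to, sketched with the sinusoidal encodings from "Attention Is All You Need" (GPT models actually learn their position embeddings, but the addition step is the same):

```python
import numpy as np

def sinusoidal_pe(num_positions, d_model):
    # Sinusoidal positional encodings; values lie in [-1, 1]
    pos = np.arange(num_positions)[:, None]   # (T, 1)
    i = np.arange(0, d_model, 2)[None, :]     # (1, d/2)
    angles = pos / 10000 ** (i / d_model)
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

T, d_model = 5, 8
word_embeddings = np.random.default_rng(0).normal(size=(T, d_model))
x = word_embeddings + sinusoidal_pe(T, d_model)  # position info is simply added
```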