Speculations on the evolution of AI into AGI

A speculation on AI evolving into AGI:
Is it possible that the path to a true AGI lies not in stacking more data or fusing more video, but in fusing more distinct "senses" — vision, hearing, touch, smell, and taste?
The first generation of ChatGPT was text-only; GPT-4 fused vision into the large model, and the upgraded GPT-4o fused auditory data, rapidly raising its apparent level of intelligence. So why not fuse touch, smell, and taste into large models as well? Could this give birth to true AGI?
According to one classification in classical Chinese philosophy, touch represents "emotion," smell represents "consciousness," and taste represents "the id of human existence." If a large model integrated touch and these other senses, is there a chance that a real AGI would be born? Is anyone willing to try?