I tried some experiments and was impressed that Language models are capable of vision even they may never saw any image before
I tried some experiments and was impressed that Language models are capable of vision even they may never saw any image before