Transformers for Tabular Data Representation

Hello. I found the following article interesting, and I hope it is of some interest to the community as well.

I, for one, am interested in how we might input and output language, mathematics, charts, diagrams, figures, graphs, infographics, and tables to and from GPT-4. In this regard, I’m inspired by the forthcoming Copilot in Excel technology.

With respect to tabular data representations, here is the article's abstract:

In the last few years, the natural language processing community has witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in tabular data, recent research efforts extend LMs by developing neural representations for structured data. In this article, we present a survey that analyzes these efforts. We first abstract the different systems according to a traditional machine learning pipeline in terms of training data, input representation, model training, and supported downstream tasks. For each aspect, we characterize and compare the proposed solutions. Finally, we discuss future work directions.
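
To make the "input representation" stage a little more concrete, here is a minimal sketch of one common idea: flattening a table row by row into a single text sequence that a language model could consume. This is only my own illustration, not a method taken from the article. The function name `linearize_table` and the `[CAPTION]`, `[HEADER]`, and `[ROW]` markers are hypothetical; real systems define their own special tokens and often add row and column position information.

```python
def linearize_table(header, rows, caption=""):
    """Flatten a table into one string, row by row.

    The [CAPTION], [HEADER], and [ROW] markers are hypothetical
    separators chosen for this sketch, not tokens from any real model.
    """
    parts = []
    if caption:
        parts.append(f"[CAPTION] {caption}")
    parts.append("[HEADER] " + " | ".join(header))
    for row in rows:
        # Pair each cell with its column name so the context survives flattening.
        cells = [f"{col}: {val}" for col, val in zip(header, row)]
        parts.append("[ROW] " + " | ".join(cells))
    return " ".join(parts)


header = ["Country", "Capital"]
rows = [["France", "Paris"], ["Japan", "Tokyo"]]
text = linearize_table(header, rows, caption="Capitals")
print(text)
```

Repeating the column name next to each cell is one simple way to keep the structure recoverable once the table has become flat text, at the cost of a longer sequence.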