Creating embeddings of tabular data

I have some data in tables that may have 3 or more columns. Such as Name|DOB|City|Zip. How should I go about creating embedding for such data? Should I create embedding for each table row with header as below:
Name|DOB|City|State: Sam Walker|1/1/1997|Paducah|KY

or something else?

Try this:

Column name: row value

For example, if you have columns Name and Date:

Name: Brian
Date: 2023-03-23
2 Likes

In this case, how would it establish relationship between two data points? For example:
Name: Tom
DOB: 01/01/01
Name: Mot
DOB: 10/10/10
If I ask GPT, “what’s DOB for MOT”, how would it know?

Just include a piece of meta data as part of the embedding, like “stats for person 1: Name: Tom, DOB…etc.” GPT won’t have any trouble interpreting that. Assign a unique ID to each person.

1 Like