I have some data in tables that may have 3 or more columns, such as Name|DOB|City|Zip. How should I go about creating embeddings for such data? Should I create an embedding for each table row with the header, as below:
Name|DOB|City|State: Sam Walker|1/1/1997|Paducah|KY
or something else?
Try this:
Column name: row value
For example, if you have columns Name and Date:
Name: Brian
Date: 2023-03-23
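As a rough sketch of that format, here is one way to turn a row into "Column name: row value" lines before embedding it. The helper name `row_to_text` is made up for illustration, and the sample row is the one from the question; the embedding call itself is left out since no particular library has been named in this thread.

```python
def row_to_text(headers, row):
    """Render one table row as 'Column: value' lines (hypothetical helper)."""
    if len(headers) != len(row):
        raise ValueError("header and row lengths differ")
    # Pair each column name with its value, one pair per line.
    return "\n".join(f"{h}: {v}" for h, v in zip(headers, row))


headers = ["Name", "DOB", "City", "State"]
row = ["Sam Walker", "1/1/1997", "Paducah", "KY"]

text = row_to_text(headers, row)
print(text)
```

The resulting string is what you would pass to whichever embedding model you use, one string per row.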
In this case, how would it establish the relationship between two data points? For example:
Name: Tom
DOB: 01/01/01
Name: Mot
DOB: 10/10/10
If I ask GPT, “What’s the DOB for Mot?”, how would it know?
Just include a piece of metadata as part of the embedding, like “stats for person 1: Name: Tom, DOB…etc.” GPT won’t have any trouble interpreting that. Assign a unique ID to each person so that the fields of one record stay tied together.
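A minimal sketch of that idea, assuming each person becomes one text chunk that is embedded on its own. The function name `record_to_chunk` and the use of a running number as the unique ID are my assumptions, not a fixed recipe; any stable ID would do.

```python
def record_to_chunk(record_id, record):
    """Build one self-contained text chunk per person (hypothetical helper)."""
    # Keep every field of the record in the same string so the name and
    # the DOB end up in the same chunk.
    fields = ", ".join(f"{key}: {value}" for key, value in record.items())
    return f"Stats for person {record_id}: {fields}"


people = [
    {"Name": "Tom", "DOB": "01/01/01"},
    {"Name": "Mot", "DOB": "10/10/10"},
]

# Assumption: a running number starting at 1 serves as the unique ID.
chunks = [record_to_chunk(i, person) for i, person in enumerate(people, start=1)]

for chunk in chunks:
    print(chunk)
```

Because Mot's name and DOB sit in the same chunk, a query about Mot should retrieve that chunk as a whole, and the model can read the DOB straight from it.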