I have some data in tables that may have 3 or more columns, such as Name|DOB|City|State. How should I go about creating embeddings for such data? Should I create an embedding for each table row, with the header included, as below:
Name|DOB|City|State: Sam Walker|1/1/1997|Paducah|KY
or something else?
Try this:
Column name: row value
For example, if you have columns Name and Date:
Name: Brian
Date: 2023-03-23
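A minimal sketch of the "Column name: row value" layout suggested above, assuming each row is already available as a dictionary keyed by column name. The helper name row_to_text is hypothetical, and the resulting string is what you would pass to whichever embedding model you use:

```python
def row_to_text(row: dict[str, str]) -> str:
    """Render one table row as 'Column: value' lines, one line per column."""
    return "\n".join(f"{column}: {value}" for column, value in row.items())


# Sample row taken from the example above.
rows = [
    {"Name": "Brian", "Date": "2023-03-23"},
]

# One text chunk per row; each chunk would be embedded separately.
chunks = [row_to_text(row) for row in rows]
print(chunks[0])
```

Keeping one chunk per row means every value travels with its column name, so the header does not have to be retrieved separately.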
In this case, how would it establish the relationship between two data points? For example:
Name: Tom
DOB: 01/01/01
Name: Mot
DOB: 10/10/10
If I ask GPT, “what’s DOB for MOT”, how would it know?
Just include a piece of metadata as part of the embedding, like “stats for person 1: Name: Tom, DOB…etc.” GPT won’t have any trouble interpreting that. Assign a unique ID to each person.
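As a rough sketch of that idea, the snippet below prefixes each record with a unique ID so that a name and its DOB stay in the same chunk. The function name record_to_text and the use of a simple running counter as the ID are assumptions for illustration, not a prescribed scheme:

```python
def record_to_text(record_id: int, row: dict[str, str]) -> str:
    """Put all fields of one person on a single line, tagged with a unique ID."""
    fields = ", ".join(f"{column}: {value}" for column, value in row.items())
    return f"Stats for person {record_id}: {fields}"


# The two records from the question above.
people = [
    {"Name": "Tom", "DOB": "01/01/01"},
    {"Name": "Mot", "DOB": "10/10/10"},
]

# IDs start at 1; any stable unique key (e.g. a database ID) would work too.
texts = [record_to_text(i, row) for i, row in enumerate(people, start=1)]
for text in texts:
    print(text)
```

Because each person is embedded as one unit, a question about Mot should retrieve the chunk that also contains Mot's DOB rather than a loose "DOB" line.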
Asking the same question in a different context. Let’s say I have a PDF file that may contain multiple tables. Using the Adobe API, I can extract the tables as Excel as well as JSON. How should I go about turning such a table into embeddings? The problem arises when the table has headings and subheadings, as shown in the image.
I used a simple CSV format for the file embedding.
Name,DateOfBirth,City,State
Jake Kirby,1987/05/02,Los Angeles,CA
Annete Madison,1974/07/11,San Francisco,CA
Mike Browning,1992/03/25,Hope,AR
Beth Drupal,1996/10/05,Atlanta,GA
Naomi Miyazaki,2002/12/04,Topeka,KS
Here is the result of the conversation:
user: what is the birthdate of beth drupal?
assistant: The birthdate of Beth Drupal is 1996/10/05.
user: where was she born?
assistant: I couldn't find the answer to that question.
user: what city and state is beth drupal born?
assistant: Beth Drupal was born in Atlanta, GA.
user: how about annete madison? what can you tell me about her?
assistant: I couldn't find the answer to that question.
user: tell me the birthdate of annete madison and what city and state she was born.
assistant: The birthdate of Annete Madison is 1974/07/11. However, I couldn't find information about the city and state she was born in.
user: When was Mike Browning born?
assistant: Mike Browning was born on 1992/03/25.
user: where is the birth place of Mike Browning?
assistant: I couldn't find the answer to that question.
user: what city and state is Mike Browning born?
assistant: Mike Browning was born in Hope, AR.
I just ran into this today. ChatGPT seems to have a hard time recognizing these headers, and the best solution I’ve found so far is to preprocess the table and merge each header with its sub-headers into a single header row.
That is, instead of a header sitting over subheader1 and subheader2, you combine them so the columns become header-subheader1 and header-subheader2.
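Here is one way that preprocessing could look, as a sketch only. It assumes the extracted table gives you two header rows of equal length, and that a merged top-level header appears once followed by empty cells (a common result of Excel exports, but check your own data). The function name flatten_headers and the sample column names are made up for illustration:

```python
def flatten_headers(top: list[str], sub: list[str], sep: str = "-") -> list[str]:
    """Combine a header row and a sub-header row into one row of column names."""
    flat = []
    current = ""  # last non-empty top-level header seen so far
    for header, subheader in zip(top, sub):
        if header:
            current = header
        if current and subheader:
            flat.append(f"{current}{sep}{subheader}")
        else:
            # Only one of the two levels is present for this column.
            flat.append(current or subheader)
    return flat


# Hypothetical two-level header: "Address" spans City/State, "Contact" spans Phone/Email.
top = ["Name", "Address", "", "Contact", ""]
sub = ["", "City", "State", "Phone", "Email"]

columns = flatten_headers(top, sub)
print(columns)
```

Note that an empty top-level cell is always treated as a continuation of the previous header, so a column that genuinely has no top-level header would need separate handling.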
Let me know if you find anything more elegant!