I’m struggling with a specific use-case and wondering if anyone has any insight.
I have been using GPT-4-vision to extract data from medical test documents, which come in a range of different formats.
GPT-4-vision does very well at identifying the correct table and extracting all of the data, with one key exception: branched rows.
Columns: Biomarker, Method, Analyte, Result
Unbranched row: 1 Biomarker, 1 Method, 1 Analyte, 1 Result (all values aligned horizontally).
Branched row: 1 Biomarker, 1 Method, 2 Analytes, 2 Results (the two Analyte/Result pairs are stacked vertically, one above and one below the shared Biomarker and Method values).
(Which rows are branched varies from document to document.)
In these cases, the model usually extracts one of the branches (sometimes upper, sometimes lower) and ignores the other.
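For concreteness, here is the shape of the output I’m after. The biomarker names and values are made-up placeholders; the point is that a branched row should expand into two records that share the Biomarker and Method:

```python
# Illustrative only: field names and values are my own placeholders.

# An unbranched row maps to a single record.
unbranched = {
    "biomarker": "ALK",
    "method": "IHC",
    "analyte": "Protein",
    "result": "Negative",
}

# A branched row should expand into two records that repeat the shared
# Biomarker and Method but carry their own Analyte and Result.
branched = [
    {"biomarker": "EGFR", "method": "NGS", "analyte": "DNA",
     "result": "Exon 19 deletion"},
    {"biomarker": "EGFR", "method": "NGS", "analyte": "RNA",
     "result": "Not detected"},
]
```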
I’m going crazy trying to word a prompt that gets it to extract both branches. Some of the things I have tried (one of the request variants I’ve been sending is sketched after this list):

1. Describing branched rows in every way I can think of (branched, shared, vertically-merged cells/values, etc.).
2. Telling it exactly which rows are branched (not feasible in production).
3. Telling it to extract only the 2 Analyte and 2 Result values for a specific biomarker.
4. Telling it to index the table by Result.
5. Telling it to output the results in different formats (CSV, HTML).

And, in frustration:

6. Telling it to extract only the Result column, specifying exactly how many values there should be (it still skipped the branched values and padded the count with values from a later table).
7. Telling it to extract ALL WORDS in the image, with no mention of a table (it extracted every word except the branched values).
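For reference, here is a minimal sketch of one request variant I’ve been sending. The prompt wording, file name, and model name are just one of the combinations I tried, not a working fix:

```python
import base64
from openai import OpenAI

client = OpenAI()

# Placeholder file name; each page is sent as a base64 data URL.
with open("report_page.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

prompt = (
    "Extract the biomarker table as JSON records with keys "
    "biomarker, method, analyte, result. Some rows are branched: "
    "one Biomarker/Method pair shares two Analyte/Result pairs, "
    "stacked above and below it. Emit one record per Analyte/Result "
    "pair, repeating the shared Biomarker and Method."
)

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```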
I’ve found examples online of extracting tables, but none with this sort of format. Has anyone found an approach for this?
The issue, as you’ve found, is that you’re confined to prompting alone. You could try manually adjusting the image itself, or find some consistent structures, automatically cut the tables out, and query them individually (roughly as sketched below), but this kind of processing is exactly what dedicated table-OCR models already do.
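A rough sketch of the cut-out idea, assuming you already have a table bounding box from somewhere (a table-detection model, or hand-tuned coordinates if the layouts are consistent); the numbers and file names below are placeholders:

```python
from PIL import Image

page = Image.open("report_page.png")

# (left, upper, right, lower) in pixels -- placeholder coordinates that
# would come from a table-detection step in practice.
table_box = (40, 320, 1180, 760)

table_crop = page.crop(table_box)
table_crop.save("table_crop.png")  # query just this crop on its own
```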
In your tests, what has made you lean towards wrestling with GPT-4-vision prompts?
Yeah, I see what you mean. I don’t want to divert too much from the original topic because I am hoping there is a fix for this.
I wouldn’t say I’m leaning towards GPT yet, but the appeal of it over the OCR approaches I have used is its flexibility in finding the right data and avoiding the wrong data when I don’t know what the document looks like in advance.
I wouldn’t be able to say which is best for your use-case without seeing a decent number of these documents, but the general idea is straightforward:

First, I’d try running a pre-built table-OCR tool and see if it can manage these documents without adding complexity.

If it can’t, and the table structure is consistent, you can implement a multi-stage OCR pipeline:

Classify → Segment → OCR
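A skeleton of that pipeline is below. The two helpers are hypothetical stand-ins: classify_layout might be a small image classifier or heuristic, and segment_table a table-structure model that emits one crop per cell, including one per Analyte/Result branch. The OCR stage here uses pytesseract, but any OCR engine would do:

```python
from PIL import Image
import pytesseract

def classify_layout(page: Image.Image) -> str:
    """Stage 1: decide which known document format this page matches."""
    raise NotImplementedError  # e.g. a small classifier or heuristics

def segment_table(page: Image.Image, layout: str) -> list[Image.Image]:
    """Stage 2: return one cropped image per table cell, using the
    layout class to pick the right segmentation rules. Branched rows
    become two cells in the Analyte and Result columns."""
    raise NotImplementedError  # e.g. a table-structure detection model

def extract(page_path: str) -> list[str]:
    """Stage 3: plain OCR over each segmented cell."""
    page = Image.open(page_path)
    layout = classify_layout(page)
    cells = segment_table(page, layout)
    return [pytesseract.image_to_string(cell).strip() for cell in cells]
```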
There was also a recent paper on OCRing documents; it has a nice demo you can try out that exercises all of the OCR stages you’d need: