I am looking for a layout-aware solution to extract content from PDFs

Hi, does anyone know of a solution that can understand the layout of a PDF and then extract the content based on that layout?

Thanks

There are a couple of solutions on GitHub, and some plugins in the paid version of ChatGPT do this as well.

If you’re looking for a solution that can understand the layout of a PDF and extract content based on that layout, you might want to consider a tool like Tabula or Apache PDFBox. Tabula is aimed at pulling tables out of PDFs into formats such as CSV, while PDFBox is a Java library that can extract text together with its position on the page, or only the text inside a rectangular region you define (see its PDFTextStripperByArea class).
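To make "layout aware" a bit more concrete, here is a minimal Python sketch of the idea, not the way Tabula or PDFBox work internally. It assumes some extraction tool has already given you each word with its page coordinates. The `Word` class, the function names, and the `column_split_x` parameter are all hypothetical names made up for this illustration. It handles a simple two-column page by reading the left column first, then the right, instead of mixing the two columns line by line.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical record: one word plus its position on the page."""
    text: str
    x0: float   # left edge of the word
    top: float  # distance from the top of the page


def group_into_lines(words, y_tolerance=3.0):
    """Group words whose vertical positions are close into lines."""
    lines = []
    for word in sorted(words, key=lambda w: (w.top, w.x0)):
        # Compare against the first word of the current line.
        if lines and abs(word.top - lines[-1][0].top) <= y_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, read left to right.
    return [sorted(line, key=lambda w: w.x0) for line in lines]


def extract_text_by_columns(words, column_split_x, y_tolerance=3.0):
    """Return text lines in reading order for an assumed two-column page.

    column_split_x is an assumption: the x coordinate of the gutter
    between the two columns, which you would have to measure or detect.
    """
    left = [w for w in words if w.x0 < column_split_x]
    right = [w for w in words if w.x0 >= column_split_x]
    result = []
    for column in (left, right):
        for line in group_into_lines(column, y_tolerance):
            result.append(" ".join(w.text for w in line))
    return result


if __name__ == "__main__":
    # Made-up sample coordinates for a two-column page.
    sample = [
        Word("Left", 50, 100), Word("one", 80, 100),
        Word("Right", 320, 100), Word("one", 360, 101),
        Word("Left", 50, 115), Word("two", 80, 115),
        Word("Right", 320, 115), Word("two", 360, 115),
    ]
    for line in extract_text_by_columns(sample, column_split_x=300):
        print(line)
```

A plain top-to-bottom sort of the same words would interleave the two columns, which is the usual symptom of extraction that ignores layout. Real documents need more than a fixed split (tables, headers, uneven columns), so treat this only as a starting point.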

I hope this helps!

Brian :palm_tree: