Best way to interact with PDF 2025

Fascinating what you can get with single prompts now…

A 3D model of a wooden shelf with variously sized compartments is rotating on a black background. (Captioned by AI)

1 Like

Pretty neat. Reminds me of something I have on my todo list for my app, which is to be able to submit each piece of content (it’s a block-based CMS similar to how Jupyter Notebooks is block-based) to OpenAI chat completions API just to ask it to assign hashtags (for semantics) to the content, so that it can build up a kind of “semantic network” automatically where the user can sort of brows things kind of like a Mind Map.

EDIT: For consistency I’ll probably pre-define the set of hashtags, and so that request will really be for ChatGPT to select which tags are applicable.

yep… that’s what my system is going to provide… I mean you already see the mindmap there…

No need to build it… will be open source

Actually to label the images you can draw a so called feature on a map and save the result as geojson and create an svg from the image + lable data and then train a cnn on that to find that kind of stuff…

imagine this but without the map and instead you add the image as background image…

https://bohrloch.kreuzung1.de/

It may have just been a coincidence that you mentioned geographical maps (which was interesting btw), as a response to “mind maps”, but what I meant was this:

Anyway we’ve sort of hijacked this thread, which was supposed to be about dealing with PDFs. I need to search this forum to see where the discussions about GraphRAG are all happening.

Best way to interact with PDF 2025 - #41 by jochenschultz I was refereing to this - which is a tree or graph like a mindmap is a tree/graph as well.

The map was for

And I assumed you wanted to label images as well. I see anything as a document - and inside PDF you will find images, diagrams, floorplan drawings and whatnot…

So it absolutely makes sense to talk about a graph/tree when talking about pdf data extraction especially in the context of multi modal GraphRAGs

Right, I agree, I’ve been trying for years to convince people to treat documents as tree structures. :slight_smile:

https://clay-ferguson.github.io/quantizr/user-guide/#5fa1df7eb4bd3b0a753229da

I also say “Conversations” (whether it’s Social Media as human-to-human or discussions with an LLM Chat Bot) are Tree Structures too!! With a Tree-centric view of “AI Chats”, you can go back to any prior point in the conversation and “branch off”, by doing a different reply than what your initial reply was.

AWS textract does that as well.

I started converting html documents to a so called nested set back in 2003 which allowed for fast retrival from the trees and worked well for scraping static pages.

1 Like

Hey @jochenschultz , was really excited by this project - was looking for something exactly like this!
Has it been released as an open source repo somewhere?