Processing Collections of Documents into Idea- and Concept-centric Encyclopedic Outputs


Hello. I would like to share an envisioned research project here for purposes of discussion.

In summary, teams could use AI to process vast collections of input documents, spanning decades or centuries, into sets of interconnected hypertext encyclopedia articles, with one output article per idea or concept.

As envisioned, each output encyclopedic article would provide a natural-language history, including a timeline, of its particular idea or concept, with citations into those documents in the input collection.

One can view this process as producing a new sort of multi-document index for the ideas and concepts that occur in, and evolve throughout, collections of input documents.
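To make the indexing idea concrete, here is a minimal sketch. It assumes documents carry an identifier, a year, and text, and it stands in for the envisioned AI concept extraction with plain keyword matching; the function name, document fields, and sample documents are all hypothetical illustrations, not part of the proposal itself.

```python
from collections import defaultdict

def build_concept_index(documents, concepts):
    """Map each concept to a chronological list of (year, doc_id)
    citations, i.e., a crude timeline of where the concept appears.

    documents: list of dicts with 'id', 'year', and 'text' keys.
    concepts: terms to index; simple substring matching here stands
    in for a real AI/NLP concept extractor.
    """
    index = defaultdict(list)
    for doc in documents:
        text = doc["text"].lower()
        for concept in concepts:
            if concept.lower() in text:
                index[concept].append((doc["year"], doc["id"]))
    # Sort each concept's citations chronologically to form a timeline.
    return {c: sorted(hits) for c, hits in index.items()}

# Hypothetical miniature corpus for illustration.
docs = [
    {"id": "memex-1945", "year": 1945,
     "text": "As We May Think describes the memex."},
    {"id": "xanadu-1965", "year": 1965,
     "text": "Project Xanadu coins the word hypertext."},
    {"id": "www-1989", "year": 1989,
     "text": "A proposal for hypertext document retrieval at CERN."},
]
index = build_concept_index(docs, ["hypertext", "memex"])
```

An output encyclopedic article for "hypertext" could then be generated from its timeline of citations, here `[(1965, "xanadu-1965"), (1989, "www-1989")]`.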

Important lexemes, e.g., terminology, in input collections spanning decades or centuries would tend to shift in meaning across authors and as the years progress.
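One crude way to surface such meaning shifts is to compare a term's most frequent neighboring words per era. The sketch below, assuming the same hypothetical document dicts as before, groups documents by decade and counts collocates within a small window; a real system would use contextual embeddings, but the era-bucketing idea is the same.

```python
from collections import Counter, defaultdict
import re

def collocates_by_era(documents, term, era_size=10, window=3):
    """Report a term's top neighboring words per era, a rough proxy
    for how the term's usage shifts over time.

    documents: list of dicts with 'year' and 'text' keys.
    era_size: bucket width in years (10 = by decade).
    window: how many words on each side of the term to count.
    """
    eras = defaultdict(Counter)
    for doc in documents:
        era = (doc["year"] // era_size) * era_size
        words = re.findall(r"[a-z]+", doc["text"].lower())
        for i, w in enumerate(words):
            if w == term.lower():
                lo, hi = max(0, i - window), i + window + 1
                # Count the words surrounding this occurrence.
                eras[era].update(words[lo:i] + words[i + 1:hi])
    return {era: counts.most_common(3)
            for era, counts in sorted(eras.items())}

# Hypothetical two-era corpus: the collocates of "computer" differ.
docs = [
    {"year": 1962, "text": "The computer fills a room and runs batch jobs."},
    {"year": 1984, "text": "A personal computer sits on every desk."},
]
shifts = collocates_by_era(docs, "computer")
```

Here `shifts` has one entry per decade (1960 and 1980), and the differing collocate lists hint at the term's drift from room-filling batch machines toward personal desktop devices.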

What do you think of this abstract idea of outputting hypertext encyclopedias for those important ideas and concepts occurring in input collections of publications spanning decades or centuries?