I think building knowledge taxonomies and ontologies is critical before even jumping into developing RAG-based applications. Without that, identifying the needle in the haystack is almost impossible; there is so much context overlap in our unstructured knowledge content. I would be keen to hear whether anyone has started using knowledge ontologies. Some examples from experience would be highly appreciated.
In the context of LAWXER (https://www.lawxer.ai) we spent about 12 months building the theory behind the unstructured text analysis (the general approach) and the framework for fact extraction and requirement compliance controls.
While the knowledge ontology is not implemented in the app logic itself, it was often used to define that logic.
In early 2025 we’ll be building the clause library engine, and that beast will definitely require a proper implementation of this approach in the knowledge base. So maybe I’ll post something on the subject, either on the website or (most likely) on my LinkedIn.
I would say it depends. My approach was to replicate the human approach to knowledge consumption and understanding, which was largely enough to get the precision to the level I needed (which is well beyond what my current competitors achieve).
The trick was to use multi-step data ingestion and multi-step data retrieval (similar to how we read and recall), backed up by the app workflows. That fixed the semantic noise issue in RAG usage.
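To make "multi-step retrieval" a bit more concrete, here is a minimal sketch of a two-pass scheme: a coarse pass over document-level summaries, then a fine pass over the chunks of the shortlisted documents. The `embed` function, the sample contracts and the in-memory indexes are toy stand-ins for illustration only, not how LAWXER actually works:

```python
# Toy sketch of two-step ingestion and retrieval:
# coarse pass over document summaries, fine pass over the winning documents' chunks.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy hashing embedding; swap in a real sentence-embedding model."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def top_k(query_vec, items, k):
    """Rank (key, vector) pairs by cosine similarity to the query."""
    scored = [(key, float(query_vec @ vec)) for key, vec in items]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:k]

# Ingestion in two steps: per-document summary vectors, then per-chunk vectors.
docs = {
    "contract_a": ["Term: 24 months, auto-renewal.", "Liability capped at fees paid."],
    "contract_b": ["Term: 12 months, no renewal.", "Unlimited liability for data breaches."],
}
summary_index = [(d, embed(" ".join(chunks))) for d, chunks in docs.items()]
chunk_index = {d: [(c, embed(c)) for c in chunks] for d, chunks in docs.items()}

def retrieve(query: str, doc_k: int = 1, chunk_k: int = 2):
    q = embed(query)
    # Step 1 (coarse): which documents are even about this topic?
    candidate_docs = [d for d, _ in top_k(q, summary_index, doc_k)]
    # Step 2 (fine): re-rank only the chunks of those documents.
    candidates = [pair for d in candidate_docs for pair in chunk_index[d]]
    return top_k(q, candidates, chunk_k)

print(retrieve("liability cap for fees"))
```

The point is that the second pass only ever sees chunks from documents that already passed the coarse filter, which is where most of the semantic noise gets cut.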
What I struggle to understand is this: the transformer architecture made it possible to get better associations between the words in a sentence, but how does that help it understand the meaning of a sentence within a given knowledge domain? It is just associating words without understanding the meaning. I feel that providing the meaning of the domain through an ontology and taxonomy may reduce hallucination and better ground the LLMs.
An ontology can be very useful, a taxonomy maybe not as much (as far as I understand ATM).
One question I asked myself was, what’s a taxonomy?
Nominally you’d think it’s a discretization of concepts into strata and taxa, but if you really dig down, these discretizations don’t really represent the world, because nothing’s ever “that simple and clear-cut” - and I’m gonna claim that any attempt at a finite taxonomy is inherently opinionated.
Disregarding that, if you were to crank up the resolution on your taxonomy graph, you’d (theoretically) eventually end up with an infinite-dimensional graphon (I’m overextending a bit tbh; I’ve never gotten that deep into graph theory). We can approximate (or learn) a lower-dimensional analogue of that graphon with LLMs, and then you get your embedding hypersurfaces.
TL;DR: Embedding space is “a taxonomy” - and IMO a vastly superior one to any a human could ever define.
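As a toy illustration of that claim, you can read a "taxonomy" straight off embedding geometry, e.g. with agglomerative clustering, without anyone curating strata by hand. The 2-D vectors below are made-up stand-ins for real model embeddings:

```python
# Sketch: taxa read off embedding geometry instead of hand-curated strata.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

concepts = ["indemnity clause", "liability cap", "termination notice", "renewal term"]
vectors = np.array([
    [0.9, 0.1],   # indemnity clause
    [0.8, 0.2],   # liability cap      (close to indemnity -> same "taxon")
    [0.1, 0.9],   # termination notice
    [0.2, 0.8],   # renewal term       (close to termination -> same "taxon")
])

# Agglomerative clustering over cosine distance yields a tree: the taxa emerge
# from the geometry rather than from an editorially chosen hierarchy.
tree = linkage(vectors, method="average", metric="cosine")
taxa = fcluster(tree, t=2, criterion="maxclust")
for concept, taxon in zip(concepts, taxa):
    print(taxon, concept)
```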
I disagree here - unless you’re trying to represent a concept that the model cannot grasp, in which case you might need to use a stronger model or work on your formulation.
Which types of hallucinations do you want to mitigate? There are three types off the top of my head (in terms of embeddings):
- Vertical: The model cannot understand the content due to a lack of focus or attention. This can be overcome by using a better model or by improving data quality (e.g. a better chunking strategy, rewording, etc.).
- Horizontal Type A (OoD/wrinkles): Embeddings can sometimes land in regions that are “Out of Distribution” - but I think that’s pretty rare unless you do some arithmetic post-processing. The surface geometry in these OoD regions might not be as smooth as expected, which can bring about “hallucinations” where unrelated things appear closer than they should. This might be overcome by picking a model that’s more suited to your data, but typically a bigger, more generally capable one is an even better option.
- Horizontal Type B (OoS/overlap): It’s possible for certain dimensions not to be representable in embedding space, causing the space to fold in on itself and retrieve undesirable elements. These Out of Space “hallucinations” often need to be cleaned up with a post-processing step (see the sketch after this list). A common manifestation is when the temporal axis becomes relevant. I’m not aware of a strong off-the-shelf model that can handle this reliably.
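By "post-processing step" I mean something roughly like this: do the similarity search first, then let metadata settle the dimension the embedding folds over - here, time. The `Hit` class, the years and the thresholds are invented for illustration:

```python
# Sketch: cleaning up "Out of Space" retrievals on the temporal axis.
# Similarity search happily mixes the 2023 and 2024 revisions of a policy,
# so a metadata pass after retrieval does the disambiguation the embedding can't.
from dataclasses import dataclass

@dataclass
class Hit:
    text: str
    year: int
    score: float  # cosine similarity from the vector store (assumed precomputed)

def temporal_rerank(hits: list[Hit], wanted_year: int) -> list[Hit]:
    """Drop hits from the wrong year; order the rest by similarity."""
    kept = [h for h in hits if h.year == wanted_year]
    return sorted(kept, key=lambda h: h.score, reverse=True)

hits = [
    Hit("Data retention policy (2024 revision): 12 months.", 2024, 0.83),
    Hit("Data retention policy (2023 revision): 24 months.", 2023, 0.85),
]
print(temporal_rerank(hits, wanted_year=2024)[0].text)
```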
I think a (human) taxonomy can potentially help you detect whether your embedding space has horizontal-type issues in your domain of interest, turning it into a sort of validation test to see if the model is capable of representing what you want. I wouldn’t do it up front, because I think the returns would be marginal, but if you’ve got the time and energy it might be worth investigating.
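Something like the following is what I mean by turning the taxonomy into a validation test. The taxonomy, the `embed` function and the margins are toy placeholders for your own domain, model and thresholds:

```python
# Sketch: using a hand-built taxonomy as a validation probe for embedding space.
# Same-taxon pairs should be more similar than cross-taxon pairs;
# violations hint at a "fold" (horizontal issue) in the space.
from itertools import combinations
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for the embedding model under test (toy hashing embedding)."""
    vec = np.zeros(64)
    for token in text.lower().split():
        vec[hash(token) % 64] += 1.0
    return vec / (np.linalg.norm(vec) or 1.0)

taxonomy = {
    "payment": ["late payment penalty", "invoice due date"],
    "privacy": ["personal data processing", "data subject rights"],
}

terms = [(taxon, term, embed(term)) for taxon, items in taxonomy.items() for term in items]
violations = []
for (tax_a, term_a, vec_a), (tax_b, term_b, vec_b) in combinations(terms, 2):
    sim = float(vec_a @ vec_b)
    if tax_a == tax_b and sim < 0.3:
        violations.append(("too far apart", term_a, term_b, sim))
    if tax_a != tax_b and sim > 0.7:
        violations.append(("suspicious overlap", term_a, term_b, sim))

print(violations or "embedding space respects this (tiny) taxonomy")
```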
An ontology otoh can help if you’re struggling with H/B overlap issues. Directional ontological links let you navigate between the overlapping layers/branes, as it were.
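A rough sketch of what I mean by navigating via directional links: tag retrieval hits with ontology concepts and walk the directed edges to keep only the branch the question is actually about. The `narrower_than` relation and the clause texts are invented for illustration; real ontologies (SKOS, OWL) have much richer link types:

```python
# Sketch: using directed ontology links to pick between overlapping retrieval hits.
ontology = {
    # child -> parent ("narrower_than")
    "independent contractor": "worker",
    "employee": "worker",
    "fixed-term employee": "employee",
}

def ancestors(concept: str) -> set[str]:
    """Walk the directed links upward to collect every broader concept."""
    seen = set()
    while concept in ontology:
        concept = ontology[concept]
        seen.add(concept)
    return seen

def filter_hits(hits: list[tuple[str, str]], query_concept: str) -> list[str]:
    """Keep hits tagged with the query concept, its ancestors, or its descendants."""
    related = ancestors(query_concept) | {query_concept}
    return [text for text, tag in hits
            if tag in related or query_concept in ancestors(tag)]

hits = [
    ("Notice period for employees is 30 days.", "employee"),
    ("Contractors may terminate with 7 days notice.", "independent contractor"),
]
# Both hits overlap semantically ("termination notice"), but the directed links
# let us keep only the branch the question is about.
print(filter_hits(hits, "employee"))
```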