Knowledge Organization: Make Semantics Explicit
The organization of knowledge on the basis of semantic knowledge models is a prerequisite for efficient knowledge exchange. A well-known counterexample is the use of individual folder systems or mind maps to organize files. This approach to knowledge organization works only at the individual level and does not scale, because it is full of implicit semantics that only the author can understand.
To organize knowledge well, we should therefore use established knowledge organization systems (KOS) to model the underlying semantic structure of a domain. Many of these methods were developed by librarians to classify and catalog their collections. The field has changed massively with the spread of the Internet and other network technologies, which has led classical methods of library science to converge with methods from the web community.
When we talk about KOSs today, we primarily mean Networked Knowledge Organization Systems (NKOS). NKOS are systems of knowledge organization such as glossaries, authority files, taxonomies, thesauri and ontologies. These support the description, validation and retrieval of various data and information within organizations and beyond their boundaries.
Let’s take a closer look: Which KOS is best for which scenario? KOS differ mainly in their ability to express different types of knowledge building blocks. Here is a list of these building blocks and the corresponding KOS.
| Building blocks | Examples | KOS |
| --- | --- | --- |
| Synonyms | Emmental = Emmental cheese | Glossary, synonym ring |
| Handle | Emmental (cheese) is not same as | Authority file |
| Hierarchical | Emmental is a cow’s-milk cheese; Cow’s-milk cheese is a cheese; Emmental (valley) is part of Switzerland | Taxonomy |
| Associative | Emmental cheese is related to cow’s milk; Emmental cheese is related to Emmental (valley) | Thesaurus |
| Classes, | Emmental is of class cow’s-milk cheese; Cow’s-milk cheese is subclass of cheese; Any cheese has exactly one country of origin | Ontology |
The Simple Knowledge Organization System (SKOS), a widely used standard specified by the World Wide Web Consortium (W3C), combines numerous knowledge building blocks under one roof. Using SKOS, all knowledge covered by the first four building blocks listed above can be expressed and linked to facts based on other ontologies.
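As a rough illustration, the first four building blocks can be written as subject-predicate-object triples that use SKOS property names. The sketch below uses plain Python tuples rather than an RDF library. The `ex:` identifiers and the two helper functions are made up for this example, and the choice of `skos:prefLabel`, `skos:altLabel`, `skos:broader` and `skos:related` for labels, synonyms, hierarchical and associative relations is an assumption about how one might encode the examples above, not a prescribed modeling.

```python
# Triples as (subject, predicate, object) tuples. The "ex:" identifiers are
# hypothetical; the "skos:" names are properties from the SKOS vocabulary.
triples = {
    # Synonyms: one preferred label plus an alternative label for the same concept.
    ("ex:EmmentalCheese", "skos:prefLabel", "Emmental"),
    ("ex:EmmentalCheese", "skos:altLabel", "Emmental cheese"),
    # Disambiguation: the valley is a separate concept with its own identifier.
    ("ex:EmmentalValley", "skos:prefLabel", "Emmental"),
    # Hierarchical relations.
    ("ex:EmmentalCheese", "skos:broader", "ex:CowsMilkCheese"),
    ("ex:CowsMilkCheese", "skos:broader", "ex:Cheese"),
    ("ex:EmmentalValley", "skos:broader", "ex:Switzerland"),
    # Associative relations.
    ("ex:EmmentalCheese", "skos:related", "ex:CowsMilk"),
    ("ex:EmmentalCheese", "skos:related", "ex:EmmentalValley"),
}


def objects(subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def concepts_labelled(label):
    """Return every concept that carries the label as preferred or alternative label."""
    label_props = {"skos:prefLabel", "skos:altLabel"}
    return sorted(s for s, p, o in triples if p in label_props and o == label)
```

Calling `concepts_labelled("Emmental")` returns both the cheese and the valley. The name alone is ambiguous, while the identifiers are not, which is the point of making semantics explicit.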
Knowledge organization systems make the meaning of data or documents, i.e., their semantics, explicit and thus accessible, machine-readable and transferable. This is not the case when someone places files on their desktop computer in a folder called “Photos-CheeseCake-January-4711” or uses tags like “CheeseCake4711” to classify digital assets. Such personal semantics remain implicit and may be understandable only to the author. NKOS and ontologies, by contrast, take a systemic approach to knowledge organization.
Basic Principles of Semantic Knowledge Modeling
Semantic knowledge modeling is similar to the way people tend to construct their own models of the world. Every person, not just subject matter experts, organizes information according to these ten fundamental principles:
- Draw a distinction between all kinds of things: ‘This thing is not that thing.’
- Give things names: ‘This thing is a cheese called Emmental’ (some might call it Emmentaler or Swiss cheese, but it’s still the same thing).
- Create facts and relate things to each other: ‘Emmental is made with cow’s milk’, ‘Cow’s milk is obtained from cows’, etc.
- Classify things: ‘This thing is a cheese, not a ham.’
- Create general facts and relate classes to each other: ‘Cheese is made from milk.’
- Use various languages for this; e.g., the above-mentioned fact in German is ‘Emmentaler wird aus Kuhmilch hergestellt’ (remember: the thing called ‘Kuhmilch’ is the same thing as the thing called ‘cow’s milk’; only the name or label for this thing differs between languages).
- Put things into different contexts: this mechanism, called “framing” in the social sciences, helps to focus on the facts that are important in a particular situation or aspect. For example, a nutritional scientist is interested in different facts about Emmental cheese than a caterer. (With named graphs you can represent this additional context information and add another dimension to your knowledge graph.)
- If things with different URIs from the same graph are actually one and the same thing, merging them into one thing while keeping all triples is usually the best option. The URI of the deprecated thing must remain permanently in the system and from then on point to the URI of the newly merged thing.
- If things with different URIs contained in different (named) graphs actually seem to be one and the same thing, mapping (instead of merging) between these two things is usually the best option.
- Inferencing: generate new relationships (new facts) based on reasoning over existing triples (known facts).
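Two of the later principles, merging with a permanent redirect and inferencing, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular tool implements them: the `ex:` identifiers, the function names, and the decision to treat the hypothetical `ex:isA` relation as transitive are all invented for this example.

```python
def merge(triples, redirects, deprecated, kept):
    """Merge `deprecated` into `kept`: rewrite all triples and record a redirect."""
    rewritten = {
        (kept if s == deprecated else s, p, kept if o == deprecated else o)
        for s, p, o in triples
    }
    new_redirects = dict(redirects)
    new_redirects[deprecated] = kept  # the old URI stays known and points to the new one
    return rewritten, new_redirects


def resolve(redirects, uri):
    """Follow redirects until a URI that is still current is reached."""
    while uri in redirects:
        uri = redirects[uri]
    return uri


def infer_transitive(triples, predicate):
    """Add (a, predicate, c) whenever (a, predicate, b) and (b, predicate, c) hold."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for s, p, o in closure if p == predicate]
        for a, b in links:
            for b2, c in links:
                if b == b2 and (a, predicate, c) not in closure:
                    closure.add((a, predicate, c))
                    changed = True
    return closure


# Hypothetical graph in which one cheese was accidentally given two URIs.
graph = {
    ("ex:Emmental", "ex:madeWith", "ex:CowsMilk"),
    ("ex:Emmentaler", "ex:isA", "ex:CowsMilkCheese"),
    ("ex:CowsMilkCheese", "ex:isA", "ex:Cheese"),
}

merged, redirects = merge(graph, {}, "ex:Emmentaler", "ex:Emmental")
inferred = infer_transitive(merged, "ex:isA")
```

After the merge, all triples refer to `ex:Emmental`, and `resolve(redirects, "ex:Emmentaler")` still leads to it. The inference step then adds the new fact that `ex:Emmental` is a cheese, which was stated nowhere explicitly.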
Many of these steps are supported by software tools. Steps 7–10 in particular (context, merging, mapping and inferencing) do not have to be carried out manually by knowledge engineers; they are processed automatically in the background. As we will see, other tasks can also be partially automated, but it is by no means possible to generate knowledge graphs fully automatically. A provider who claims to do so is not generating a knowledge graph but calculating a simpler model, such as a co-occurrence network.
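To see why a co-occurrence network is the simpler model, consider the following sketch. It assumes that documents have already been reduced to lists of terms (the sample documents and the function name are invented for illustration) and merely counts how often two terms appear together.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(documents):
    """Count how often each unordered pair of terms appears in the same document."""
    edges = Counter()
    for terms in documents:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(terms)), 2):
            edges[pair] += 1
    return edges


# Hypothetical documents, each already reduced to its list of terms.
docs = [
    ["Emmental", "cow's milk", "cheese"],
    ["Emmental", "Switzerland"],
    ["Emmental", "cow's milk"],
]
network = cooccurrence_network(docs)
```

The resulting edges carry a weight but no meaning: the network records that “Emmental” and “cow's milk” occur together, not that one is made with the other. Named things, typed relations and classes, as described in the principles above, are missing.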
Read more: The Knowledge Graph Cookbook