5th July: Knowledge Graphs
Summary of the Second Round Table. 05.07.2022:
The second roundtable session focused on knowledge graphs and began with an introduction by Mirco Schönfeld to the vocabulary surrounding them. The first point was that knowledge graphs have no single definition. Their function, however, is the representation of knowledge. They capture knowledge as ‘subject, verb, object’ statements and thus show interconnections in a scheme. When knowledge is connected in larger knowledge graphs, instances are linked to ontological concepts, which shows how particular instances relate to one another.
Knowledge graphs can contain many different elements, among them concepts, classes, and properties. Characteristics that knowledge graphs tend to have are:
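To make the ‘Subject, Verb and Object’ idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not the tooling used in the session: the example statements and the relation names `is_a`, `subclass_of` and `located_in` are invented.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str  # the "verb" of the statement
    obj: str


# Invented example data: instances (Bayreuth, Germany) are connected
# to ontological concepts (City, Place, Country).
triples = [
    Triple("Bayreuth", "is_a", "City"),
    Triple("City", "subclass_of", "Place"),
    Triple("Bayreuth", "located_in", "Germany"),
    Triple("Germany", "is_a", "Country"),
]


def objects_of(subject, predicate, graph):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


def concepts_of(instance, graph):
    """Collect the concepts of an instance by following is_a, then subclass_of."""
    found = []
    frontier = objects_of(instance, "is_a", graph)
    while frontier:
        concept = frontier.pop()
        if concept not in found:
            found.append(concept)
            frontier.extend(objects_of(concept, "subclass_of", graph))
    return found


print(concepts_of("Bayreuth", triples))
```

Because every statement has the same three-part shape, new facts can be added without changing the structure, which is what makes the network form possible.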
- combination of ontological knowledge + concrete instances
- network as form of connection
- inherent logic of Knowledge Graphs
- complex, but non-hierarchical
- allow connection to external data (external ontologies, external knowledge graphs)
When going from knowledge graphs to linked open data, relations are replaced with URLs. Linked open data is defined here as links to other data on the internet, which means that data in our knowledge graphs can be connected to data that already exists online. As an example, start from some underlying data: entities first have to be found in the document texts, and a knowledge graph is then created that connects these entities to already existing ontologies. These existing ontologies can in turn be connected to further entities.
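The example can be sketched as a very small entity-linking step. This is a hedged illustration only: the gazetteer, the document URL and the `mentions` relation under `example.org` are invented, and real entity recognition is far more involved than a word lookup. The two Wikidata entity URLs (Q64 for Berlin, Q183 for Germany) stand in for "already existing data on the internet".

```python
import re

WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"

# Hand-made lookup table (an assumption for this sketch): names that may
# appear in a text, mapped to URLs of existing entities.
GAZETTEER = {
    "Berlin": WIKIDATA_ENTITY + "Q64",
    "Germany": WIKIDATA_ENTITY + "Q183",
}

# Hypothetical relation URL; a real project would reuse an existing vocabulary.
MENTIONS = "http://example.org/vocab/mentions"


def find_entities(text):
    """Return the gazetteer names that occur as whole words in the text."""
    return [
        name
        for name in GAZETTEER
        if re.search(rf"\b{re.escape(name)}\b", text)
    ]


def link_document(doc_url, text):
    """Build (subject, relation, object) triples in which every part is a URL."""
    return [(doc_url, MENTIONS, GAZETTEER[name]) for name in find_entities(text)]


for triple in link_document("http://example.org/doc/1", "A letter sent from Berlin."):
    print(triple)
```

Once the object of a triple is a shared URL, anything else published about that URL can be connected to the document without further agreement on names.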
The second aspect, brought into the round by Sulayman Sowe, was the role of location in the knowledge graphs in WissKI. The current project is to transfer geolocation data from Wikidata to WissKI. Collecting this data took a long time in the first place, and some of the data was missing after the transfer. Additionally, the query was too bulky to transfer all the data at once. The solution was to query state by state. This raised the problem of defining which number represented which state, which was solved by using the Wikidata identifier of each state and connecting the geolocations to that identifier. The goal is therefore to upload all the geolocation data, categorize it by continent, and give users a way to enter new geolocations when one does not yet exist or cannot be found. Overall, geolocations without a name need to be named and unknown places need to be labelled.
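The state-by-state approach could look roughly like the sketch below. It is not the project's actual query: it assumes "state" means a country (Wikidata property P17; P131 would be the choice for sub-national units), uses P625 for coordinates, and all function names are invented. The network call in `fetch_state` is shown for completeness and is not executed here.

```python
import json
import re
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_state_query(state_qid, state_property="P17", limit=1000):
    """Build one small SPARQL query for a single state, keyed by its Wikidata ID."""
    if not re.fullmatch(r"Q[1-9]\d*", state_qid):
        raise ValueError(f"not a Wikidata identifier: {state_qid!r}")
    return (
        "SELECT ?place ?placeLabel ?coord WHERE {\n"
        f"  ?place wdt:{state_property} wd:{state_qid} ;\n"
        "         wdt:P625 ?coord .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def parse_point(wkt):
    """Turn a 'Point(longitude latitude)' literal into a (lat, lon) pair."""
    match = re.fullmatch(r"Point\(([-+\d.eE]+) ([-+\d.eE]+)\)", wkt.strip())
    if match is None:
        raise ValueError(f"unexpected coordinate literal: {wkt!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon


def to_records(state_qid, bindings):
    """Attach every geolocation to the identifier of the state it was queried for."""
    records = []
    for row in bindings:
        lat, lon = parse_point(row["coord"]["value"])
        records.append({
            "state": state_qid,
            "place": row["place"]["value"],
            "label": row.get("placeLabel", {}).get("value"),  # None if unnamed
            "lat": lat,
            "lon": lon,
        })
    return records


def fetch_state(state_qid):
    """Run the query for one state against the public endpoint (needs network)."""
    params = urllib.parse.urlencode(
        {"query": build_state_query(state_qid), "format": "json"}
    )
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "geolocation-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    return to_records(state_qid, data["results"]["bindings"])


print(build_state_query("Q183", limit=5))
```

Keeping the state identifier on every record avoids the numbering problem described above, and records whose `label` is empty are the ones that still need to be named.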
Lastly, the semantic web, which represents the meaning of a specific object at a specific time and place, was brought into the discussion by Wynand van der Walt. For it to work, a good knowledge structure is needed, as well as grammatical rules that make it fluid. The readability of linked data must therefore be considered. Notes on the definitions of linked data elements in MODS:
- Creator: multiple types of creators (narrators, recorders, etc.), multiple authority files
- Role term: authority file
- Genre: many different genre options
- Language: many African languages are not contained in the ISO 639 series
- Subject: subjects might contain codes (not readable for humans)
- Can be expanded; what works best?
- Challenge going forward: data standards need to be decided from the start
A comment by Anke Schürer-Ries at the end of the second session critiqued some of the vocabulary used: since it was very computer-science related, it might not be understandable for every scientist. Thus, definitions for the researchers are necessary.